THE ANNALS 
of 
MATHEMATICAL 
STATISTICS 


(FOUNDED BY H. C. CARVER) 
THE OFFICIAL JOURNAL OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


VOLUME XVIII 







THE ANNALS 
OF MATHEMATICAL STATISTICS 


EDITED BY 
S. S. WILKS, Editor 


M. S. BARTLETT HARALD CRAMER J. NEYMAN 
WILLIAM G. COCHRAN W. EDWARDS DEMING WALTER A. SHEWHART 
ALLEN T. CRAIG J. L. DOOB JOHN W. TUKEY 
C. C. CRAIG W. FELLER A. WALD 
HAROLD HOTELLING 


WITH THE COOPERATION OF 


T. W. ANDERSON, JR. CHURCHILL EISENHART WILLIAM G. MADOW
M. A. GIRSHICK ALEXANDER M. MOOD
PAUL R. HALMOS FREDERICK MOSTELLER
PAUL G. HOEL HENRY SCHEFFÉ
MARK KAC JACOB WOLFOWITZ


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the
Institute of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given 
issue should be reported to the Secretary on or before the 15th of the 
month preceding the month of that issue. The months of issue are March,
June, September and December. 


Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS 
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts 
should be typewritten double-spaced with wide margins, and the original copy 
should be submitted. Footnotes should be reduced to a minimum and whenever 
possible replaced by a bibliography at the end of the paper; formulae in foot- 
notes should be avoided. Figures, charts, and diagrams should be drawn on 
plain white paper or tracing cloth in black India ink twice the size they are to 
be printed. Authors are requested to keep in mind typographical difficulties 
of complicated mathematical formulae. 


Authors will ordinarily receive only galley proofs. Fifty reprints without 
covers will be furnished free. Additional reprints and covers furnished at cost. 


The subscription price for the ANNALS is $5.00 per year. Single copies $1.50. 
Back numbers are available at $5.00 per volume, or $1.50 per single issue. 


COMPOSED AND PRINTED AT THE
WAVERLY PRESS, INC.
BALTIMORE, MD., U. S. A.





Entered as second-class matter at the Post Office at Baltimore, Maryland, under the Act of March 3, 1879












Contents 


The General Canonical Correlation Distribution. M. S. BARTLETT
On the Theory of Markoff Chains. ELLIOTT W. MONTROLL
On the First Two Moments of the Measure of a Random Set.

On a Test of Whether One of Two Random Variables is Stochastically Larger
than the Other. H. B. MANN and D. R. WHITNEY

On the Convergence of Sequences of Moment Generating Functions. W.
KOZAKIEWICZ

A Generalization of Tshebyshev’s Inequality to Two Dimensions. Z. W.
BIRNBAUM, J. RAYMOND and H. S. ZUCKERMAN

Distribution of the Serial Correlation Coefficient in a Circularly Correlated
Universe. R. B. LEIPNIK

Concerning the Effect of Intraclass Correlation on Certain Significance Tests.
JOHN E. WALSH


NOTES:

On the Studentization of Several Variances. B. L. WELCH

Probability Schemes with Contagion in Space and Time.
CERNUSCHI and LOUIS CASTAGNETTO

Fitting Curves with Zero or Infinite End Points.

Consistency of Sequential Binomial Estimates.


Book Reviews

Report on the Boston Meeting of the Institute

Annual Report of the President of the Institute

Annual Report of the Secretary-Treasurer of the Institute

Annual Report of the Editor

Constitution and the By-Laws of the Institute


Vol. XVIII, No. 1 — March, 1947 





THE ANNALS
OF MATHEMATICAL STATISTICS


EDITED BY
S. S. WILKS, Editor

M. S. BARTLETT HARALD CRAMER J. NEYMAN
WILLIAM G. COCHRAN W. EDWARDS DEMING WALTER A. SHEWHART
C. C. CRAIG J. L. DOOB JOHN W. TUKEY
ALLEN T. CRAIG W. FELLER A. WALD
HAROLD HOTELLING


WITH THE COOPERATION OF


T. W. ANDERSON, JR. CHURCHILL EISENHART WILLIAM G. MADOW
J. H. CURTISS M. A. GIRSHICK ALEXANDER M. MOOD
PAUL R. HALMOS FREDERICK MOSTELLER
PAUL G. HOEL HENRY SCHEFFÉ
PAUL S. DWYER MARK KAC JACOB WOLFOWITZ


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the
Institute of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given
issue should be reported to the Secretary on or before the 15th of the
month preceding the month of that issue. The months of issue are March,
June, September and December. Because of post-war difficulties of
publication, issues may often be from two to four weeks late in appearing.
Subscribers are therefore requested to wait at least 30 days after month of issue
before making inquiries concerning non-delivery.













THE GENERAL CANONICAL CORRELATION DISTRIBUTION 


By M. S. BARTLETT
University of Cambridge, England and University of North Carolina 


1. Summary. The general canonical correlation distribution is given as a
multiple power series in the true canonical correlations $\rho_i$. When only one
true correlation is not zero, this series is expressible as a generalized
hypergeometric function, for the cases both of non-central means and of correlations
proper. In the general case of more than one non-zero true correlation the
coefficients in the expansion depend on the conditional moments of the sample
correlations between the pairs of transformed variables representing the true
canonical variables, when the sample canonical correlations between the sample
canonical variables are fixed. Methods are given of obtaining these coefficients
for both cases, non-central means and correlations proper; and their form up to
the fourth order, corresponding to $O(\rho^8)$ in the expansion, is listed in Appendix I.
The detailed terms making up these coefficients are given, in the case of two
non-zero correlations, up to the fourth order, and in the general case, up to the
third order, in Appendix II.


2. Introductory remarks; the case of zero roots. In the statistical theory of
the relation of one vector variate with another (see Hotelling [1]), the
simultaneous distribution of the canonical correlations $r_i$, which are the roots of a
certain determinantal equation, was first obtained in 1939 (Fisher [2], Hsu [3],
Roy [4]) in the special but important case when the true roots or correlations $\rho_i$
are zero. Roy [5] has since investigated the case where the true roots are not
zero when these non-zero values arise from non-central means. The present
investigation is primarily intended to cover the alternative case where non-zero
roots arise from the existence of true correlations $\rho_i$. The method developed is,
however, also applicable to the case of non-central means; and it is shown that
the general distribution, which for more than one non-zero root becomes very
complicated, does not in the case of non-central means agree with the
distribution given by Roy [5] except in the case of only one non-zero root.¹

It will be convenient in this introductory section to sketch (with slight
modifications) the method used by Hsu [3] to obtain the solution in the case of zero
roots, as some of his intermediate formulae are useful for the present
developments. We consider a dependent vector variate with $p$ components, and an
independent² vector variate with $q$ components. For definiteness we assume


¹ This conclusion has also been reached by T. W. Anderson, who has given a solution
of the non-central means problem in the cases of either one or two non-zero roots (Annals
of Math. Stat., Vol. 17 (1946), pp. 409-431).

² This classification of a variate as the "dependent variate" or "independent variate"
is in the regression sense, and does not necessarily imply statistical dependence or
independence.







$p < q$, and the sample with $n\,(> p + q)$ degrees of freedom corresponding to the
dependent variate is divided in the usual way (see, for example, [6]) into a part
with $q$ degrees of freedom corresponding to the independent variate and the
remaining part with $n - q$ degrees of freedom. If $a_{ij}$, $b_{ij}$ denote the sums of
squares and products corresponding to this division, then it is known that the
joint distribution of $a_{ij}$ and $b_{ij}$, if the dependent vector variate is normal and
actually, in the statistical sense, independent of the second vector variate, is

$$\frac{|A|^{\frac12(q-p-1)}\,|B|^{\frac12(n-q-p-1)}\exp\left[-\frac12\sum_{i=1}^{p}(a_{ii}+b_{ii})\right]da\,db}{2^{\frac12 np}\,\pi^{\frac12 p(p-1)}\prod_{i=0}^{p-1}\left\{\Gamma[\frac12(q-i)]\,\Gamma[\frac12(n-q-i)]\right\}} \tag{1}$$

where $|A|$ denotes the determinant of the matrix $A = \{a_{ij}\}$, and $da$ the product
of differentials $da_{ij}$, and where for convenience the variance matrix of the
dependent variate is taken to be the unit matrix.

We make the transformation specified by

$$A = WDW', \qquad A + B = WW', \tag{2}$$

where $D$ is a diagonal matrix of the quantities $r_i^2$ in descending order of magnitude,
and $W = \{w_{ij}\}$ is a matrix (with transpose $W'$) uniquely determined by (2)
except for an ambiguity of sign for each column; this ambiguity can be eliminated
by choosing positive elements in the first row. The Jacobian $\Delta$ of the
transformation may be shown to be

$$\Delta = 2^{p}\,|WW'|^{\frac12 p+1}\prod_{i=1}^{p}\prod_{j=i+1}^{p}(r_i^2 - r_j^2). \tag{3}$$

By direct substitution, we obtain from (1) the distribution

$$p(a_{ij}, b_{ij}) = p(w_{ij}, r_i) = p(w_{ij})\,p(r_i),$$

where $p(x)$ is a general notation³ for a distribution function in one or more
variates $x$ (including the differential elements); for $p(w_{ij})$ and $p(r_i)$ we have

$$p(w_{ij}) = C_1\,|WW'|^{\frac12(n-p)}\exp\left[-\frac12\sum_{i,j=1}^{p}w_{ij}^2\right]dw, \tag{4}$$

$$p(r_i) = C_2\prod_{i=1}^{p}\left\{(r_i^2)^{\frac12(q-p-1)}(1-r_i^2)^{\frac12(n-q-p-1)}\right\}\prod_{i=1}^{p}\prod_{j=i+1}^{p}(r_i^2-r_j^2)\,dr^2, \tag{5}$$


³ The probability symbol is not of course to be confused with the number $p$ of components
in the dependent variate. It should also be noted that for convenience $p(x_i)$ is used to
denote the joint probability for a set of quantities $x_i$, whereas $p(x_1)$ or $p(x_2)$ denotes the
probability for the specified variate $x_1$ or $x_2$ considered separately.















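the constants $C_1$ and $C_2$ being arranged to give unity on integration of $p(w_{ij})$
or $p(r_i)$, i.e. we have

$$C_1 = 2^{p-\frac12 np}\,\pi^{-\frac12 p^2}\prod_{i=0}^{p-1}\left\{\Gamma[\tfrac12(p-i)]\big/\Gamma[\tfrac12(n-i)]\right\}, \tag{6}$$

(the $w_{ij}$ varying from $-\infty$ to $\infty$ except that $w_{1j} > 0$), and

$$C_2 = \pi^{\frac12 p}\prod_{i=0}^{p-1}\left\{\Gamma[\tfrac12(n-i)]\big/\left(\Gamma[\tfrac12(q-i)]\,\Gamma[\tfrac12(p-i)]\,\Gamma[\tfrac12(n-q-i)]\right)\right\}. \tag{7}$$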
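As a numerical illustration of the normalizing constant in (7), the following sketch evaluates $C_2$ through log-gamma functions. It assumes the reading $C_2 = \pi^{p/2}\prod_{i=0}^{p-1}\Gamma[\frac12(n-i)]/\{\Gamma[\frac12(q-i)]\Gamma[\frac12(p-i)]\Gamma[\frac12(n-q-i)]\}$; the function names are illustrative and not part of the paper. For $p = 2$, $q = 3$ this reading should reduce to $(n-2)(n-3)(n-4)/4$, the factor met in the worked example of section 5.

```python
import math

def log_c2(n: int, p: int, q: int) -> float:
    """Logarithm of C_2 under the assumed reading of (7).

    Requires n - q - (p - 1) > 0 so that every gamma argument is positive.
    """
    total = 0.5 * p * math.log(math.pi)
    for i in range(p):
        total += math.lgamma(0.5 * (n - i))
        total -= math.lgamma(0.5 * (q - i))
        total -= math.lgamma(0.5 * (p - i))
        total -= math.lgamma(0.5 * (n - q - i))
    return total

def c2(n: int, p: int, q: int) -> float:
    """C_2 itself, obtained by exponentiating the log form."""
    return math.exp(log_c2(n, p, q))

if __name__ == "__main__":
    for n in (6, 7, 8, 12):
        print(n, c2(n, 2, 3), (n - 2) * (n - 3) * (n - 4) / 4)
```

Working on the log scale avoids overflow of the gamma functions when $n$ is large.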


3. Formal determination of the general distribution. The method to be 
adopted of obtaining the general distribution from the particular case quoted in 
equation (5) above is the same in principle as the one adopted by Fisher [7] in 
his derivation of the general distribution of the multiple correlation coefficient. 
Since the argument is more involved in the present problem, it will be presented 
first in formal probability terms, before the details of the solution are examined. 

We consider a transformation of the components of each vector variate to the
true canonical components. Let the observed ordinary correlation coefficients
of these mutually independent components for one vector variate with the
corresponding components of the second vector variate be denoted by $s_i$. The
true correlations are the true canonical correlations $\rho_i$. Then we have for the
general canonical correlation distribution, denoted by⁴ $p(r_i \mid \rho_i)$, the expression

$$\begin{aligned}
p(r_i \mid \rho_i) &= \int p(r_i, s_i \mid \rho_i) \\
&= \int p(r_i \mid s_i, \rho_i)\,p(s_i \mid \rho_i) \\
&= \int p(r_i \mid s_i)\,p(s_1 \mid \rho_1)\,p(s_2 \mid \rho_2)\cdots p(s_p \mid \rho_p),
\end{aligned}$$

the integration being over the $s_i$, and the substitution $p(r_i \mid s_i)$ for
$p(r_i \mid s_i, \rho_i)$ following from the sufficiency of the independent correlations
$s_i$ of the corresponding pairs of canonical components, as statistics for the
$\rho_i$. We now define the function $g(s_1, \rho_1)$ by the relation

$$p(s_1 \mid \rho_1) = p(s_1 \mid \rho_1 = 0)\,g(s_1, \rho_1),$$

whence we have the general solution

$$\begin{aligned}
p(r_i \mid \rho_i) &= \int p(r_i \mid s_i)\,p(s_1 \mid \rho_1 = 0)\,g(s_1, \rho_1)\,p(s_2 \mid \rho_2 = 0)\,g(s_2, \rho_2)\cdots \\
&= \int p(r_i, s_i \mid \rho_i = 0)\,g(s_1, \rho_1)\,g(s_2, \rho_2)\cdots \\
&= p(r_i \mid \rho_i = 0)\int p(s_i \mid r_i, \rho_i = 0)\,g(s_1, \rho_1)\,g(s_2, \rho_2)\cdots
\end{aligned} \tag{8}$$

for $p(r_i \mid \rho_i)$ in terms of the special case $p(r_i \mid \rho_i = 0)$.


⁴ Quantities to the right of the vertical stroke in a probability bracket are given
quantities on which the probability distribution depends.




Now according as the independent vector variate is considered as (a) a normal
variate with which the dependent variate is correlated, (b) a fixed vector in
sample space (this includes the non-central means case), Fisher [7] has shown that
the distribution of the multiple correlation $R$ of a single dependent variate with
an independent variate comprising $m$ components is $p(R \mid \rho = 0)\,g(R, \rho)$, where

$$\begin{aligned}
\text{(a)}\quad g(R, \rho) &= F(\tfrac12 n, \tfrac12 n; \tfrac12 m; \rho^2R^2)\,(1-\rho^2)^{\frac12 n}, \\
\text{(b)}\quad g(R, \rho) &= F(\tfrac12 n; \tfrac12 m; \tfrac12\beta^2R^2)\,e^{-\frac12\beta^2},
\end{aligned} \tag{9}$$

where we replace $\rho^2$ by a parameter $\beta^2$ in case (b), and the notation for
hypergeometric functions used is:

$$F(\alpha; \beta; x) = 1 + \frac{\alpha}{\beta}x + \frac{\alpha(\alpha+1)}{\beta(\beta+1)}\frac{x^2}{2!} + \cdots,$$

$$F(\alpha_1, \alpha_2; \beta; x) = 1 + \frac{\alpha_1\alpha_2}{\beta}x + \frac{\alpha_1(\alpha_1+1)\,\alpha_2(\alpha_2+1)}{\beta(\beta+1)}\frac{x^2}{2!} + \cdots.$$

It follows that we may write $g(s_i, \rho_i)$ above in the form

$$\begin{aligned}
\text{(a)}\quad g(s_i, \rho_i) &= F(\tfrac12 n, \tfrac12 n; \tfrac12; \rho_i^2s_i^2)\,(1-\rho_i^2)^{\frac12 n}, \\
\text{(b)}\quad g(s_i, \rho_i) &= F(\tfrac12 n; \tfrac12; \tfrac12\beta_i^2s_i^2)\,e^{-\frac12\beta_i^2},
\end{aligned} \tag{10}$$

by putting $m = 1$ in (9) (the signs of the $s_i$ are arbitrary, so that we are
essentially concerned, as in the multiple correlation distribution, with the squares
of the correlations). From these series expansions the integral in (8) consists of
terms corresponding to the conditional moments, for any set of positive integers
$t_1, t_2, \ldots, t_p$,

$$\mu(t_1, t_2, \ldots, t_p) = E\{(s_1^2)^{t_1}(s_2^2)^{t_2}\cdots(s_p^2)^{t_p} \mid r_i\} = \int (s_1^2)^{t_1}(s_2^2)^{t_2}\cdots(s_p^2)^{t_p}\,p(s_i \mid r_i, \rho_i = 0).$$

In the particular case when only $\rho_1 \ne 0$, the moments $\mu(t) = E\{(s_1^2)^t \mid r_i\}$ from
the single factor $g(s_1, \rho_1)$ are all that arise, but in the general case it is important
to notice that the quantities $s_i$, while statistically independent when unrestricted,
are no longer independent for the conditional distribution $p(s_i \mid r_i, \rho_i = 0)$.
This completes the formal solution. It remains to evaluate $\mu(t_1, t_2, \ldots, t_p)$.
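The series notation of (9) and the role of $g$ as a density ratio can be illustrated numerically. The sketch below sums the hypergeometric series term by term and checks that $g(R, \rho)$ in case (a) averages to unity under $\rho = 0$. It assumes the standard fact, not stated in the text, that $R^2$ then follows a Beta$(\frac12 m, \frac12(n-m))$ law, so that $E\{R^{2t}\}$ is a ratio of rising factorials. All function names are illustrative.

```python
import math

def hyp_series(alphas, betas, x, terms=300):
    """Partial sum of F(alphas; betas; x) in the notation of (9).

    Successive terms differ by the ratio prod(a + t) / (prod(b + t) (t + 1)) * x.
    """
    total, term = 0.0, 1.0
    for t in range(terms):
        total += term
        ratio = x / (t + 1)
        for a in alphas:
            ratio *= a + t
        for b in betas:
            ratio /= b + t
        term *= ratio
    return total

def g_case_a(R, rho, n, m):
    """g(R, rho) of (9), case (a): correlations proper."""
    return hyp_series([0.5 * n, 0.5 * n], [0.5 * m], (rho * R) ** 2) * (1 - rho ** 2) ** (0.5 * n)

def g_case_b(R, beta, n, m):
    """g of (9), case (b): fixed independent variate, parameter beta^2."""
    return hyp_series([0.5 * n], [0.5 * m], 0.5 * (beta * R) ** 2) * math.exp(-0.5 * beta ** 2)

def mean_g_case_a(rho, n, m, terms=300):
    """E{g(R, rho)} under rho = 0, assuming R^2 ~ Beta(m/2, (n - m)/2)."""
    rho2 = rho * rho
    coef, moment, total = 1.0, 1.0, 0.0
    for t in range(terms):
        total += coef * moment                      # series coefficient times E{R^(2t)}
        coef *= (0.5 * n + t) ** 2 / ((0.5 * m + t) * (t + 1)) * rho2
        moment *= (0.5 * m + t) / (0.5 * n + t)
    return total * (1 - rho2) ** (0.5 * n)

if __name__ == "__main__":
    print(mean_g_case_a(0.6, 10, 1))   # should be very close to 1
```

Updating the coefficient and the moment by their term ratios, rather than forming large factorials, keeps the sum within floating-point range.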


4. The conditional moment $\mu(t_1, t_2, \ldots, t_p)$. First of all we note from the
choice of the components of the dependent vector variate, applying the analysis
of section 2 to such components, that the multiple correlation $R_i$ between the
$i$th component and the $q$ components of the independent variate is given by

$$R_i^2 = a_{ii}/(a_{ii} + b_{ii}) = \alpha_{i1}^2r_1^2 + \alpha_{i2}^2r_2^2 + \cdots + \alpha_{ip}^2r_p^2,$$

where

$$\alpha_{ij} = w_{ij}\big/\sqrt{(w_{i1}^2 + w_{i2}^2 + \cdots + w_{ip}^2)}.$$
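The identity for $R_i^2$ follows from the diagonal elements of $A = WDW'$ and $A + B = WW'$ in (2). A minimal sketch, with $W$ and the squared roots supplied as plain Python lists (the function name and the sample values are illustrative only), computes both sides row by row:

```python
import math

def multiple_correlations_sq(W, r2):
    """For A = W D W' and A + B = W W' with D = diag(r2), return for each row i
    the pair (a_ii / (a_ii + b_ii), sum_j alpha_ij^2 r_j^2)."""
    out = []
    for row in W:
        a_ii = sum(w * w * d for w, d in zip(row, r2))    # diagonal element of W D W'
        total = sum(w * w for w in row)                   # diagonal element of W W'
        alphas = [w / math.sqrt(total) for w in row]      # direction cosines alpha_ij
        weighted = sum(a * a * d for a, d in zip(alphas, r2))
        out.append((a_ii / total, weighted))
    return out

if __name__ == "__main__":
    W = [[1.0, 2.0, -0.5], [0.3, -1.2, 0.7], [0.9, 0.1, 0.4]]
    r2 = [0.8, 0.5, 0.1]
    for ratio, weighted in multiple_correlations_sq(W, r2):
        print(ratio, weighted)
```

Each $R_i^2$ is a weighted mean of the $r_j^2$ with weights $\alpha_{ij}^2$ summing to one, so it lies between the smallest and largest squared root.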



















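To obtain the distribution of the $\alpha_{ij}$ from that of the $w_{ij}$, we note that the $w_{ij}$
distribution (4) is normal (allowing for convenience $w_{1j}$ to vary from $-\infty$ to $\infty$)
except for the "linkage factor"

$$2^{-\frac12 p(n-p)}\,|WW'|^{\frac12(n-p)}\prod_{i=0}^{p-1}\left\{\Gamma[\tfrac12(p-i)]\big/\Gamma[\tfrac12(n-i)]\right\}.$$

Hence if we transform to the variables $c_{ii}$, $\theta_{ij}$ defined by

$$\begin{aligned}
c_{ii}^2 &= w_{i1}^2 + w_{i2}^2 + \cdots + w_{ip}^2, \\
\alpha_{i1} &= \cos\theta_{i1}, \\
\alpha_{i2} &= \sin\theta_{i1}\cos\theta_{i2}, \\
\alpha_{i3} &= \sin\theta_{i1}\sin\theta_{i2}\cos\theta_{i3}, \\
&\;\;\vdots \\
\alpha_{ip} &= \sin\theta_{i1}\sin\theta_{i2}\sin\theta_{i3}\cdots\sin\theta_{i,p-1},
\end{aligned} \tag{11}$$

the sets $c_{ii}$, $\theta_{ij}$, which for normal $w_{ij}$ would all be independent with distributions:

$p(c_{ii}^2)$: $\chi^2$ distribution with $p$ degrees of freedom,

$$p(\theta_{ij}) \propto \sin^{p-j-1}\theta_{ij}\,d\theta_{ij}, \tag{12}$$

$(0 < \theta_{ij} < \pi$ for $j = 1, 2, \ldots, p-2$; $0 \le \theta_{i,p-1} < 2\pi)$,

in general retain their independence for given $i$, but the linkage factor results
in an elevation of the $\chi^2$ distributions to $n$ degrees of freedom, and a linkage
factor for the $\theta_{ij}$ distributions of

$$|\Delta|^{\frac12(n-p)}\prod_{i=0}^{p-1}\left\{\frac{\Gamma[\tfrac12(p-i)]\,\Gamma[\tfrac12 n]}{\Gamma[\tfrac12(n-i)]\,\Gamma[\tfrac12 p]}\right\}, \tag{13}$$

where

$$\Delta = \{\alpha_{i1}\alpha_{j1} + \alpha_{i2}\alpha_{j2} + \cdots + \alpha_{ip}\alpha_{jp}\}.$$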
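The polar transformation (11) can be written as a short routine mapping the $p - 1$ angles of one row to its $p$ direction cosines. This is a sketch only; the function name is illustrative.

```python
import math

def alphas_from_angles(thetas):
    """Map (theta_1, ..., theta_{p-1}) to (alpha_1, ..., alpha_p) as in (11)."""
    alphas, sin_prod = [], 1.0
    for th in thetas:
        alphas.append(sin_prod * math.cos(th))   # sin(th_1)...sin(th_{j-1}) cos(th_j)
        sin_prod *= math.sin(th)
    alphas.append(sin_prod)                      # last coordinate: product of all sines
    return alphas

if __name__ == "__main__":
    a = alphas_from_angles([0.3, 1.1, 2.0])
    print(a, sum(x * x for x in a))
```

By construction the squared direction cosines sum to one, whatever the angles.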










We may now, having obtained the distribution of the $\alpha_{ij}$, note their
geometrical interpretation. Let us denote the $p$ components of the dependent variate
in $n$-dimensional sample space by the $p$ vectors $\xi_1, \xi_2, \ldots, \xi_p$. Let the $p$
orthogonal canonical components corresponding to the sample canonical
correlations $r_i$ be denoted by the $p$ unit vectors $x_1, x_2, \ldots, x_p$. Let the corresponding
components for the independent variate be $\eta_i$, $y_i$. The "linkage factor"
merely represents the allowance that must be made in the mutual relations of the
$\xi$-vectors for the fact that while they must lie in the $p$-space of the $x$-vectors,
they really belong to the original $n$-space. We may identify the $w_{ij}$ with the
coefficients in the equation

$$\xi_i = w_{i1}x_1 + w_{i2}x_2 + \cdots + w_{ip}x_p, \tag{14}$$

where

$$\xi_i^2 = w_{i1}^2 + w_{i2}^2 + \cdots + w_{ip}^2$$

is a $\chi^2$ with $n$, and not $p$, degrees of freedom. If we now suppose for convenience
$\xi_i$ to be a unit vector, we have in place of (14)

$$\xi_i = \alpha_{i1}x_1 + \alpha_{i2}x_2 + \cdots + \alpha_{ip}x_p, \tag{15}$$

with a projection, on the $q$-space of the $y$-vectors, of $\zeta_i$, say, where

$$\zeta_i = \alpha_{i1}r_1y_1 + \alpha_{i2}r_2y_2 + \cdots + \alpha_{ip}r_py_p,$$

and hence, as already noted in the algebraic derivation,

$$R_i^2 = (\xi_i\cdot\zeta_i)^2/\zeta_i^2 = \alpha_{i1}^2r_1^2 + \alpha_{i2}^2r_2^2 + \cdots + \alpha_{ip}^2r_p^2,$$

where $(\xi\cdot\zeta)$ denotes a scalar product. The linkage factor (13) indicates that
the $\xi_i$ vectors in (15) are not independent in the $p$-space of the $x$-vectors, the
distribution of their mutual configuration being determined by $n$-space.

This interpretation enables us to determine the moments of the distribution
$p(s_i \mid r_i)$. For if corresponding to (15) we write

$$\eta_i = \beta_{i1}y_1 + \beta_{i2}y_2 + \cdots + \beta_{iq}y_q, \tag{16}$$

then

$$s_i = \alpha_{i1}\beta_{i1}r_1 + \alpha_{i2}\beta_{i2}r_2 + \cdots + \alpha_{ip}\beta_{ip}r_p. \tag{17}$$

If we are considering case (a), the relations of the $\eta_i$ to the $y$-vectors in $q$-space
will be similar to the relations of the $\xi_i$ to the $x$-vectors in $p$-space. In case (b),
however, the $\eta_i$, which represent the true canonical components of a set of $q$
fixed vectors, must remain strictly orthogonal to each other although their
relation to the $y$-vectors can vary. This means that the relations of the $\eta_i$
to the $y$-vectors are determined by a random rotation of a rigid orthogonal set
of $q$ vectors in case (b). We may note that if in case (a) we allowed $n$ to tend to
infinity, the $\eta_i$ would also become rigidly orthogonal, so that the solution in case
(b) may conveniently be obtained from case (a) by retaining the same
distribution of the $\alpha_{ij}$, and for the $\beta_{ij}$ letting $n \to \infty$.

Thus in either case the moments of the $s_i$ can be obtained from (17) in terms
of the moments of $\alpha_{ij}$ and $\beta_{ij}$, two independent sets of coefficients for which the
distribution of each set is known. The above comments suffice theoretically to
complete the required solution, for $(s_1^2)^{t_1}(s_2^2)^{t_2}\cdots(s_p^2)^{t_p}$ is a function of $\alpha_{ij}$ and
$\beta_{ij}$; the $\alpha_{ij}$ and the corresponding linkage factor can be expressed in terms of
$\sin\theta_{ij}$ and $\cos\theta_{ij}$, and similarly for the $\beta_{ij}$ in terms of, say, $\sin\phi_{ij}$ and $\cos\phi_{ij}$,
and integration carried out over the $\theta_{ij}$ and $\phi_{ij}$. This method is unfortunately
too cumbersome algebraically to be of any practical value except in the case of
one non-zero root. This case is considered separately before the general case is
discussed further.


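5. The case of only one non-zero root. Here we only require $\mu(t)$ and a
comparatively simple solution is possible, the linkages within the $\xi_i$ and $\eta_i$ sets
being irrelevant. We have in fact, if $\psi$ is the angle between $\eta_1$ and $\zeta_1$ (where
$\zeta_1$ was the projection of $\xi_1$ in the $q$-space), that $\psi$ is a random angle in the $q$-space,
since the $\alpha_{ij}$ and $\beta_{ij}$ sets are independent. Hence in this particular case we may
conveniently write $s_1^2 = R_1^2\cos^2\psi$, which is just the transformation used to obtain
the distribution of the multiple correlation $R_1^2$. Thus we may replace (10) by
(9), where $R_1^2 = \alpha_{11}^2r_1^2 + \alpha_{12}^2r_2^2 + \cdots + \alpha_{1p}^2r_p^2$, and

$$(R_1^2)^t = \sum_{u_1+u_2+\cdots=t}\frac{t!}{u_1!\,u_2!\cdots}(r_1^2)^{u_1}(r_2^2)^{u_2}\cdots\cos^{2u_1}\theta_{11}\sin^{2(t-u_1)}\theta_{11}\cos^{2u_2}\theta_{12}\sin^{2(t-u_1-u_2)}\theta_{12}\cdots, \tag{18}$$

where the expected value of the trigonometric term is evaluated as

$$\frac{\Gamma(u_1+\tfrac12)\,\Gamma(u_2+\tfrac12)\cdots\Gamma(\tfrac12 p)}{\Gamma(\tfrac12)\,\Gamma(\tfrac12)\cdots\Gamma(\tfrac12 p+t)}.$$

We have now obtained the distribution, $(\rho_2 = \cdots = \rho_p = 0)$,

$$p(r_i \mid \rho_1 \ne 0) = p(r_i \mid \rho_i = 0)\sum_{u_1,u_2,\ldots}C(u_1, u_2, \ldots)\,(r_1^2)^{u_1}(r_2^2)^{u_2}\cdots,$$

where $p(r_i \mid \rho_i = 0)$ is given by (5); and in case (a)

$$C(u_1, u_2, \ldots) = (1-\rho_1^2)^{\frac12 n}\,\rho_1^{2t}\left[\frac{\Gamma(\tfrac12 n+t)}{\Gamma(\tfrac12 n)}\right]^2\frac{\Gamma(\tfrac12 p)\,\Gamma(\tfrac12 q)}{\Gamma(\tfrac12 p+t)\,\Gamma(\tfrac12 q+t)}\prod_{j=1}^{p}\left[\frac{\Gamma(u_j+\tfrac12)}{\Gamma(\tfrac12)\,u_j!}\right],$$

and in case (b)

$$C(u_1, u_2, \ldots) = e^{-\frac12\beta_1^2}\,(\tfrac12\beta_1^2)^t\,\frac{\Gamma(\tfrac12 n+t)\,\Gamma(\tfrac12 p)\,\Gamma(\tfrac12 q)}{\Gamma(\tfrac12 n)\,\Gamma(\tfrac12 p+t)\,\Gamma(\tfrac12 q+t)}\prod_{j=1}^{p}\left[\frac{\Gamma(u_j+\tfrac12)}{\Gamma(\tfrac12)\,u_j!}\right],$$

where $u_1 + u_2 + \cdots + u_p$ is denoted by $t$, and $\sum_{u_1,u_2,\ldots}$ denotes summation of
all $u$'s from 0 to $\infty$. The solution in either case contains a generalized
hypergeometric function. If we denote the general series

$$\sum_{u_1,u_2,\ldots}\frac{\Gamma(\alpha_1+t)\,\Gamma(\alpha_2+t)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\,\frac{\Gamma(\gamma_1)\,\Gamma(\gamma_2)}{\Gamma(\gamma_1+t)\,\Gamma(\gamma_2+t)}\prod_{j=1}^{p}\left[\frac{\Gamma(\beta_j+u_j)\,x_j^{u_j}}{\Gamma(\beta_j)\,u_j!}\right],$$

$$\sum_{u_1,u_2,\ldots}\frac{\Gamma(\alpha+t)}{\Gamma(\alpha)}\,\frac{\Gamma(\gamma_1)\,\Gamma(\gamma_2)}{\Gamma(\gamma_1+t)\,\Gamma(\gamma_2+t)}\prod_{j=1}^{p}\left[\frac{\Gamma(\beta_j+u_j)\,x_j^{u_j}}{\Gamma(\beta_j)\,u_j!}\right],$$

by

$$F(\alpha_1, \alpha_2; \beta_1, \beta_2, \ldots, \beta_p; \gamma_1, \gamma_2; x_1, x_2, \ldots, x_p),$$

$$F(\alpha; \beta_1, \beta_2, \ldots, \beta_p; \gamma_1, \gamma_2; x_1, x_2, \ldots, x_p)$$

respectively (see [8, p. 300, example 22]), then we have in case (a)

$$p(r_i \mid \rho_1 \ne 0) = p(r_i \mid \rho_i = 0)\,(1-\rho_1^2)^{\frac12 n}\times F(\tfrac12 n, \tfrac12 n; \tfrac12, \tfrac12, \ldots, \tfrac12; \tfrac12 p, \tfrac12 q; \rho_1^2r_1^2, \rho_1^2r_2^2, \ldots, \rho_1^2r_p^2), \tag{19}$$

and in case (b)

$$p(r_i \mid \beta_1 \ne 0) = p(r_i \mid \rho_i = 0)\,e^{-\frac12\beta_1^2}\times F(\tfrac12 n; \tfrac12, \tfrac12, \ldots, \tfrac12; \tfrac12 p, \tfrac12 q; \tfrac12\beta_1^2r_1^2, \tfrac12\beta_1^2r_2^2, \ldots, \tfrac12\beta_1^2r_p^2). \tag{20}$$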


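An alternative operational form is obtained by noting that the sum of terms for
given $t = u_1 + u_2 + \cdots + u_p$ is generated by means of the coefficient of $z^t$ in

$$\prod_{j=1}^{p}(1-\rho_1^2r_j^2z)^{-\frac12},$$

where for definiteness we consider case (a). Hence if we write

$$F(\alpha_1, \alpha_2; \gamma_1, \gamma_2; z^{-1}) = \sum_{t}\frac{\Gamma(\alpha_1+t)\,\Gamma(\alpha_2+t)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}\,\frac{\Gamma(\gamma_1)\,\Gamma(\gamma_2)}{\Gamma(\gamma_1+t)\,\Gamma(\gamma_2+t)}\,z^{-t},$$

we have

$$F(\tfrac12 n, \tfrac12 n; \tfrac12, \tfrac12, \ldots, \tfrac12; \tfrac12 p, \tfrac12 q; \rho_1^2r_1^2, \rho_1^2r_2^2, \ldots, \rho_1^2r_p^2) = \mathcal{C}\,F(\tfrac12 n, \tfrac12 n; \tfrac12 p, \tfrac12 q; z^{-1})\prod_{j=1}^{p}(1-\rho_1^2r_j^2z)^{-\frac12}, \tag{21}$$

where $\mathcal{C}$ denotes the operation of taking the term independent of $z$ (this might
possibly be done by multiplication by $z^{-1}$ and evaluation of a suitable contour
integral, but in the use of this formula here the operation $\mathcal{C}$ has been carried out
directly).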
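The generating-function step can be checked exactly in rational arithmetic. The sketch below compares the coefficient of $z^t$ in $\prod_j (1 - x_jz)^{-1/2}$, found by multiplying truncated series, with the direct sum over $u_1 + \cdots + u_p = t$ of $\prod_j \Gamma(u_j+\frac12)x_j^{u_j}/\{\Gamma(\frac12)u_j!\}$. Here $x_j$ stands for $\rho_1^2r_j^2$, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def half_coeff(u: int) -> Fraction:
    """Gamma(u + 1/2) / (Gamma(1/2) u!) as an exact fraction."""
    c = Fraction(1)
    for k in range(u):
        c *= Fraction(2 * k + 1, 2 * (k + 1))
    return c

def direct_sum(xs, t: int) -> Fraction:
    """Sum over u_1 + ... + u_p = t of prod_j half_coeff(u_j) * x_j**u_j."""
    total = Fraction(0)
    for us in product(range(t + 1), repeat=len(xs)):
        if sum(us) == t:
            term = Fraction(1)
            for u, x in zip(us, xs):
                term *= half_coeff(u) * x ** u
            total += term
    return total

def generating_coeff(xs, t: int) -> Fraction:
    """Coefficient of z**t in prod_j (1 - x_j z)**(-1/2), via truncated series products."""
    series = [Fraction(1)] + [Fraction(0)] * t
    for x in xs:
        factor = [half_coeff(k) * x ** k for k in range(t + 1)]
        series = [sum(series[i] * factor[k - i] for i in range(k + 1))
                  for k in range(t + 1)]
    return series[t]

if __name__ == "__main__":
    xs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
    print(direct_sum(xs, 4), generating_coeff(xs, 4))
```

A simple independent check is the case of two equal arguments $x_1 = x_2 = 1$, where the product is $(1 - z)^{-1}$ and every coefficient equals one.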

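It is of some interest to examine a simple case, and, incidentally, to check that

$$\int p(r_i \mid \rho_i) = 1.$$

If we take $p = 2$, $q = 3$, we obtain for $p(r_1^2, r_2^2 \mid \rho_1 = \rho_2 = 0)$ the form

$$\tfrac14(n-2)(n-3)(n-4)\,(1-r_1^2)^{\frac12(n-6)}(1-r_2^2)^{\frac12(n-6)}(r_1^2-r_2^2)\,dr_1^2\,dr_2^2.$$

Considering the distribution (19) with $p = 2$, $q = 3$, and taking the most
elementary case $n = 6$, we obtain on integration of $r_2^2$ from 0 to $r_1^2$,

$$p(r_1^2 \mid \rho_1) = 6(r_1^2)^2\,dr_1^2\,(1-\rho_1^2)^3\sum_{u_1,u_2}\frac{[\Gamma(3+t)]^2}{[\Gamma(3)]^2}\,\frac{\Gamma(\tfrac32)\,\Gamma(\tfrac12+u_1)\,\Gamma(\tfrac12+u_2)\,(\rho_1^2r_1^2)^t}{\Gamma(\tfrac32+t)\,\Gamma(\tfrac12)\,\Gamma(\tfrac12)\,u_1!\,(u_2+2)!\,t!}, \tag{22}$$

where $t = u_1 + u_2$. Now from the identity $(1-x)^{-\frac12}(1-x)^{\frac32} = 1 - x$, the
coefficient of $x^{t+2}$, $(t \ge 0)$, is zero in the expansion of the left-hand side. This
provides the identity, for all $t \ge 0$,

$$\sum_{u_1}\frac{\Gamma(\tfrac12+u_1)\,\Gamma(\tfrac12+t-u_1)}{\Gamma(\tfrac12)\,u_1!\,\Gamma(-\tfrac32)\,(u_2+2)!} = \frac32\,\frac{\Gamma(\tfrac12+t+1)}{\Gamma(\tfrac12)\,(t+1)!} - \frac{\Gamma(\tfrac12+t+2)}{\Gamma(\tfrac12)\,(t+2)!} = \frac{(t+3)\,\Gamma(\tfrac32+t)}{2\,\Gamma(\tfrac12)\,\Gamma(t+3)},$$

or

$$\sum_{u_1}\frac{\Gamma(\tfrac12+u_1)\,\Gamma(\tfrac12+t-u_1)}{\Gamma(\tfrac12)\,u_1!\,\Gamma(\tfrac12)\,(u_2+2)!} = \frac{t+3}{3}\,\frac{\Gamma(\tfrac32+t)}{\Gamma(\tfrac32)\,\Gamma(t+3)}.$$

Hence

$$\begin{aligned}
p(r_1^2 \mid \rho_1) &= 6(r_1^2)^2\,dr_1^2\,(1-\rho_1^2)^3\sum_{t}\frac{[\Gamma(3+t)]^2\,\Gamma(\tfrac32)}{[\Gamma(3)]^2\,\Gamma(\tfrac32+t)\,t!}\cdot\frac{(t+3)\,\Gamma(\tfrac32+t)}{3\,\Gamma(\tfrac32)\,\Gamma(t+3)}\,(\rho_1^2r_1^2)^t \\
&= \tfrac12(r_1^2)^2\,dr_1^2\,(1-\rho_1^2)^3\sum_{t}\frac{\Gamma(3+t)\,(\rho_1^2r_1^2)^t\,(t+3)}{t!} \\
&= (1-\rho_1^2)^3\,dr_1^2\,\frac{\partial}{\partial r_1^2}\left\{(r_1^2)^3(1-\rho_1^2r_1^2)^{-3}\right\},
\end{aligned} \tag{23}$$

which obviously gives unity on integration of $r_1^2$ from 0 to 1. In purely algebraic
form

$$p(r_1^2 \mid \rho_1) = 3(1-\rho_1^2)^3(r_1^2)^2\,dr_1^2\big/(1-\rho_1^2r_1^2)^4. \tag{24}$$

Alternatively, making use of formula (21), we have for the same case $p = 2$,
$q = 3$, $n = 6$, the distribution

$$6(r_1^2-r_2^2)\,dr_1^2\,dr_2^2\,(1-\rho_1^2)^3\;\mathcal{C}\,F(3, 3; 1, \tfrac32; z^{-1})\,(1-\rho_1^2r_1^2z)^{-\frac12}(1-\rho_1^2r_2^2z)^{-\frac12}. \tag{25}$$

Integrating with respect to $r_2^2$ from 0 to $r_1^2$, we obtain

$$6\,dr_1^2\,(1-\rho_1^2)^3\;\mathcal{C}\,F(3, 3; 1, \tfrac32; z^{-1})\,(1-\rho_1^2r_1^2z)^{-\frac12}\left\{\frac{2r_1^2}{\rho_1^2z} - \frac{4}{3(\rho_1^2z)^2} + \frac{4(1-\rho_1^2r_1^2z)^{\frac32}}{3(\rho_1^2z)^2}\right\}. \tag{26}$$

Discarding the term for which the irrational expression $(1-\rho_1^2r_1^2z)^{\frac12}$ cancels,
and hence leaves no terms independent of $z$, we obtain the distribution $p(r_1^2 \mid \rho_1)$
given in (23) or (24) by selection of the appropriate terms. We may further
integrate directly the expression above with respect to $r_1^2$, and after discarding
again irrelevant terms we obtain

$$6(1-\rho_1^2)^3\;\mathcal{C}\,F(3, 3; 1, \tfrac32; z^{-1})\left\{-\frac{4(1-\rho_1^2z)^{\frac12}}{3(\rho_1^2z)^2}\right\}, \tag{27}$$

which is readily ascertained to be unity.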
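Two steps of this worked example lend themselves to a quick check. The sketch below verifies the summation identity in the form $\sum_{u_1+u_2=t} (\frac12)_{u_1}(\frac12)_{u_2}/\{u_1!\,(u_2+2)!\} = \frac{t+3}{3}(\frac32)_t/(t+2)!$ exactly, writing each gamma ratio as a rising factorial, and integrates the density (24), read as $3(1-\rho_1^2)^3x^2/(1-\rho_1^2x)^4$ with $x = r_1^2$, by Simpson's rule. The function names and the choice of quadrature are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def rising(a: Fraction, t: int) -> Fraction:
    """Rising factorial a(a+1)...(a+t-1), i.e. Gamma(a+t)/Gamma(a)."""
    out = Fraction(1)
    for k in range(t):
        out *= a + k
    return out

def identity_lhs(t: int) -> Fraction:
    """Sum over u1 + u2 = t of (1/2)_{u1} (1/2)_{u2} / (u1! (u2+2)!)."""
    half = Fraction(1, 2)
    return sum(rising(half, u1) * rising(half, t - u1)
               / (factorial(u1) * factorial(t - u1 + 2)) for u1 in range(t + 1))

def identity_rhs(t: int) -> Fraction:
    """(t+3)/3 * (3/2)_t / (t+2)!  (Gamma(t+3) = (t+2)!)."""
    return Fraction(t + 3, 3) * rising(Fraction(3, 2), t) / factorial(t + 2)

def density24(x: float, rho2: float) -> float:
    """Density of x = r_1^2 as read from (24)."""
    return 3 * (1 - rho2) ** 3 * x ** 2 / (1 - rho2 * x) ** 4

def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

if __name__ == "__main__":
    print(all(identity_lhs(t) == identity_rhs(t) for t in range(12)))
    print(simpson(lambda x: density24(x, 0.36), 0.0, 1.0))
```

The integral can also be compared with the closed form $(1-\rho_1^2)^3x^3/(1-\rho_1^2x)^3$ evaluated at $x = 1$, which is the antiderivative displayed in (23).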


6. More than one non-zero root. In the general case the factor multiplying
$p(r_i \mid \rho_i = 0)$ is rather remarkable in being symmetrical in both the set $r_i$ and
the set $\rho_i$. As $n$ increases, the convergence of $r_1$ to $\rho_1$, $r_2$ to $\rho_2$, etc. when the
$\rho_i$ are also arranged in descending order of magnitude must result from the
restriction $r_1 > r_2 > \cdots > r_p$. The limiting distribution has been discussed
by Hsu [9].

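In view of the algebraic difficulty of obtaining $\mu(t_1, t_2, \ldots, t_p)$ by direct
integration, an unsymmetric method of obtaining the moments was developed.
This is fairly tractable in the case of two non-zero roots. The second set $w_{2j}$
of the original variables is transformed by an orthogonal transformation such
that the first new variable of the second set is determined by the correlation
between $w_{1j}$ and $w_{2j}$. We may write, for example,

$$\begin{aligned}
w'_{21} &= \frac{w_{11}w_{21} + w_{12}w_{22} + \cdots}{(w_{11}^2 + w_{12}^2 + \cdots + w_{1p}^2)^{\frac12}}, \\
w'_{22} &= \frac{-(w_{12}^2 + \cdots + w_{1p}^2)\,w_{21} + w_{11}(w_{12}w_{22} + \cdots)}{(w_{11}^2 + \cdots + w_{1p}^2)^{\frac12}\,(w_{12}^2 + \cdots + w_{1p}^2)^{\frac12}}, \\
w'_{23} &= \frac{-(w_{13}^2 + \cdots + w_{1p}^2)\,w_{22} + w_{12}(w_{13}w_{23} + \cdots)}{(w_{12}^2 + \cdots + w_{1p}^2)^{\frac12}\,(w_{13}^2 + \cdots + w_{1p}^2)^{\frac12}}, \quad\ldots,
\end{aligned} \tag{28}$$

which conversely we can at once express as a relation of the $w_{2j}$, in terms of the
$w'_{2j}$ (since the reciprocal of an orthogonal matrix is simply its transpose). If
we write

$$\begin{aligned}
\alpha'_{21} &= w'_{21}\big/[(w'_{21})^2 + (w'_{22})^2 + \cdots + (w'_{2p})^2]^{\frac12}, \\
\alpha'_{22} &= w'_{22}\big/[(w'_{21})^2 + (w'_{22})^2 + \cdots + (w'_{2p})^2]^{\frac12},
\end{aligned} \tag{29}$$

and write further

$a_1 = \cos\theta_{11}$, $a_2 = \cos\theta_{12}$, $\ldots$, $b_1 = \cos\theta'_{21}$,
$b_2 = \cos\theta'_{22}$, $\ldots$, where $\alpha'_{21} = \cos\theta'_{21}$, $\alpha'_{22} = \sin\theta'_{21}\cos\theta'_{22}$,

we have in particular

$$\begin{aligned}
\alpha_{21} &= a_1b_1 - b_2\sqrt{(1-a_1^2)}\,\sqrt{(1-b_1^2)}, \\
\alpha_{22} &= a_2b_1\sqrt{(1-a_1^2)} + a_1a_2b_2\sqrt{(1-b_1^2)} - b_3\sqrt{(1-a_2^2)}\,\sqrt{(1-b_1^2)}\,\sqrt{(1-b_2^2)},
\end{aligned} \tag{30}$$

where the distribution of the $a$'s and $b$'s is proportional to

$$\{(1-a_1^2)^{\frac12(p-3)}da_1\}\{(1-a_2^2)^{\frac12(p-4)}da_2\}\cdots\{(1-b_1^2)^{\frac12(n-3)}db_1\}\{(1-b_2^2)^{\frac12(p-4)}db_2\}\cdots$$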







For the reasons discussed in section 4, it will be noticed that only the
distribution of $b_1$ in the $a$, $b$ set is affected by the linkage factor. By such methods the
expressions

$$\mu(1, 1) = E\{s_1^2s_2^2 \mid r_i\}, \qquad \mu(2, 1) = E\{s_1^4s_2^2 \mid r_i\}$$

were fairly readily obtained. If we introduce the notation

$$S_2 = \sum_{i=1}^{p}(r_i^2)^2, \qquad S_{11} = \sum_{i<j}(r_i^2)(r_j^2),$$

and also symbols for the products of the $\alpha$ and $\beta$ moments, viz.

$$\begin{pmatrix}2\\2\end{pmatrix} = E\{\alpha_{11}^2\alpha_{21}^2\}\,E\{\beta_{11}^2\beta_{21}^2\}, \qquad \begin{pmatrix}2&\cdot\\\cdot&2\end{pmatrix} = E\{\alpha_{11}^2\alpha_{22}^2\}\,E\{\beta_{11}^2\beta_{22}^2\},$$

etc., we may list the moments $\mu(t_1, t_2, \ldots, t_p)$ as in Appendix I, which gives all
moments up to the fourth order in terms of the $\alpha$ and $\beta$ moments (the numerical
coefficients arise from the numbers of ways of forming the two-way partitions).
"Half-factors" corresponding to the $\alpha$ moments are listed in Appendix II against
their appropriate symbol, the corresponding factors coming from the $\beta$ moments
being obtained in case (a) by writing $q$ for $p$ and in case (b) by writing also⁵
$n \to \infty$. Thus in case (a)

$$\begin{aligned}
\mu(1, 1) = {}&\left[\frac{n+2}{np(p+2)}\right]\left[\frac{n+2}{nq(q+2)}\right]S_2 \\
&+ \left\{\left[\frac{np+n-2}{np(p+2)(p-1)}\right]\left[\frac{nq+n-2}{nq(q+2)(q-1)}\right] + 2\left[\frac{n-p}{np(p+2)(p-1)}\right]\left[\frac{n-q}{nq(q+2)(q-1)}\right]\right\}2S_{11},
\end{aligned} \tag{32}$$

and in case (b)

$$\mu(1, 1) = \left[\frac{n+2}{np(p+2)}\right]\left[\frac{1}{q(q+2)}\right]S_2 + \frac{(np+n-2)(q+1) + 2(n-p)}{np(p+2)(p-1)\,q(q+2)(q-1)}\,2S_{11}. \tag{33}$$

By means of the transformation (28) it is possible to develop the moments
$\mu(t_1, t_2)$ in the case of two non-zero roots, though in obtaining the results quoted
in Appendix II, where the formulae for $\mu(3, 1)$ and $\mu(2, 2)$ are included, it was
found convenient to supplement this method with the devices mentioned in the


⁵ It should be remembered that we have assumed $p < q$. If $p > q$, we interchange
the dependent and independent vector variates, and hence must interchange $p$ and $q$ in
these moment formulae, $p\,(< q)$ now corresponding to the independent variate.







next section. In the case of more than two non-zero roots, it is theoretically
possible to carry out a further transformation on the $w_{3j}$ variates, but with the
"partial" variates $w_{1j\cdot2} = w_{1j} - b_{12}w_{2j}$, where

$$b_{12} = (w_{11}w_{21} + w_{12}w_{22} + \cdots)/(w_{21}^2 + w_{22}^2 + \cdots),$$

as coefficients. This enables us to express $w_{3j}$ in terms of new variables, of
which the first is related to the partial correlation of $w_{3j}$ with $w_{1j}$ for given $w_{2j}$,
i.e. to the second correlation factor which depends on the "linkage"; and so on.
This method is, however, again too cumbersome to be of much use, and a more
rapid method of evaluating $\mu(t_1, t_2, \ldots, t_p)$ in general is desirable. This problem
has not been entirely solved to the author's satisfaction in this paper, although
in the concluding section are mentioned devices which have been found useful,
and which enabled the terms for the remaining third-order moment $\mu(1, 1, 1)$
to be completed and added to Appendix II.


7. Relations among the $\alpha$-moments. Equation (15) defining the $\alpha$'s, the
$\xi_i$ being random vectors in the $p$-space of the $x$-vectors except for their mutual
configuration being determined by the properties of $n$-space, may be used to
provide relations among the $\alpha$-moments. Thus in addition to the identities

$$\alpha_{i1}^2 + \alpha_{i2}^2 + \cdots + \alpha_{ip}^2 = 1, \qquad (i = 1, 2, \ldots, p), \tag{34}$$

the correlation of any $\xi_i$ with a fixed vector in the $p$-space, e.g. with $x_1$ or with
$(x_1 + x_2)/\sqrt2$, is a random correlation in $p$-space, whereas the correlation of any
$\xi_i$ with any other $\xi_j$ is a random correlation in $n$-space. The use of these facts
is best illustrated by an example, and equations sufficient to determine the
six $\alpha$-moments required for $\mu(1, 1, 1)$ will be derived.

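For convenience, denote the required mean values of

$$\alpha_{11}^2\alpha_{21}^2\alpha_{31}^2,\quad \alpha_{11}^2\alpha_{21}^2\alpha_{32}^2,\quad \alpha_{11}^2\alpha_{22}^2\alpha_{33}^2,\quad \alpha_{11}\alpha_{12}\alpha_{21}\alpha_{22}\alpha_{31}^2,\quad \alpha_{11}\alpha_{12}\alpha_{21}\alpha_{22}\alpha_{33}^2,\quad \alpha_{11}\alpha_{12}\alpha_{22}\alpha_{23}\alpha_{31}\alpha_{33}$$

by $A$, $B$, $C$, $D$, $E$, $F$ respectively. Multiply the second-order quantities $\alpha_{11}^2\alpha_{21}^2$,
$\alpha_{11}^2\alpha_{22}^2$, $\alpha_{11}\alpha_{12}\alpha_{21}\alpha_{22}$ by expression (34) for $i = 3$; since this expression is identically
unity, the consequent mean values are unaltered. Taken with the second-order
half-factors of Appendix II, and combined so as to simplify the right-hand sides,
this gives the three relations

$$\begin{aligned}
A + (p-1)B &= (n+2)/\{np(p+2)\}, \\
A + 3(p-1)B + (p-1)(p-2)C &= 1/p, \\
A + (p-1)B + 2(p-1)D + (p-1)(p-2)E &= 1/(np).
\end{aligned} \tag{35}$$

The moment $A$ is the mean of the triple product of the squared scalar products
of $\xi_1$, $\xi_2$ and $\xi_3$ with $x_1$. The same value must be realized with any other fixed
vector in the $p$-space, e.g. with either $(x_1 + x_2)/\sqrt2$ or with $(x_1 + x_2 + \cdots + x_p)/\sqrt p$.
This gives two relations

$$\begin{aligned}
A - B - 4D &= 0, \\
(p+1)A - 3B - 12D - (p-2)(C + 6E + 8F) &= 0.
\end{aligned} \tag{36}$$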







A final linearly independent relation is obtained from the mean triple product of
$(\xi_1\cdot\xi_2)$, $(\xi_1\cdot\xi_3)$, $(\xi_2\cdot\xi_3)$, which depends solely on the internal configuration of
$\xi_1$, $\xi_2$ and $\xi_3$, and is easily shown (e.g. choose $\xi_1$ to coincide with one of the original
axes of the $n$-space) to be $1/n^2$. This gives

$$pA + 3p(p-1)D + p(p-1)(p-2)F = 1/n^2. \tag{37}$$

The equations contained in (35), (36) and (37) determine $A$, $B$, $C$, $D$, $E$ and $F$.

Similar equations could evidently be constructed for the higher-order moments, 
e.g. for the terms required for u(2, 1, 1) or u(1, 1, 1,1), but the numbers of such 
terms increase rapidly. From Appendix I it will be seen that there are 24 
distinct terms in p(2, 1, 1) and 16in w(1, 1, 1, 1). 


Appendix I. 


w(1, 1) = Ss, (;) + 28u{ (? >) + 2({ i) 
u(2, 1) = 8 (5) + suf (* 2) +6 (5 "+s (3 ‘)} 
+ 65m {3(? : 5) +12() *)} 
13,0 = 5(S)+n{(° 3)+15(5 )+2(6 1) 


saul om 9) 
( ) 
+ 28m (15 (4 , 5) + 45(5 . *) + 120($ | *) +30(7 | ‘\ 

+ 2480 415 (? dln >) + 90 - ° a 
. es 1 

u2,2) = (4) + Su {12 (¢ >) +16 C ‘)} 
va eG Dont 

as 2 2 1 2 3 1 


22 - ' 2 > & 3 







2 2 - 
nn 3s2) = 8:(3) + 5 (2 ) (1 
2 - 2 
( /2 : 4 ’. et 1 
rae 2 a 11: a 
i - ¥ - « @ 1 


3 
+ 16{1 
2 
3 1 
+16{1 1 
‘ o 1 
+ 3(2 - . 
im |g . 2 





i 




Appendix II. 
(3) n+2 ? ») np+n—2 a —(n — p) 
2) np(p + 2)’\ 2) np(p + 2)(p — 1)’ \1_ 1) np(p + 2)(p — 1)’ 
(;) 3(n ++ 4) ‘ ») 3(np + 3n — 4) 


np(n + 2)(p + 4)’ 2) np(p + 2)(p + 4)(p — 1)’ 


*) np+n+2p—4 —_ np + 3n — 4 
np(p + 2)(p + 4)(p — 1)’ 2] np(p + 2)(p + 4)(p — 1)’ 


') —3(n — p) (; 1 *) —(n — p) 
1} np(p + 2)(p+4)(p—1)’\1 1 -/ np(p + 2)(p + 4)(p — 1)’ 
2] np(p + 2)(p + 4)(p + 6)(p — 1)’ 


np(p + 2)(p + 4)(p + 6)’ 
3(np + n + 4p — 6) 
np(p + 2)(p + 4)(p + 6)(p — 1)’ 
(‘ ; 3(np + 3n + 2p — 6) 
2/ np(p + 2)(p + 4)(p + 6)(p — 1)° 


(; 
(; 

() aor eer 15(n + 6) (° ») 15(np + 5n — 6) 
) 


(2 


(’ 2 ») 3(np + 5n — 6) 
2] np(p + 2)(p + 4)(p + 6)(p — 1)’ 


c 2 7 pe ie ine 
2 ‘} np(p + 2)(p + 4)(p + 6)(p — 1)’ 


(? 2 2 ,) np + 5n —6 
, 2) np(p + 2)(p + 4)(p + 6)(p — 1)’ 


(i) a 
1 1) np(p + 2)p + 4)(p + 6)(p — 1)’ 










( 


i 


1 
1 


(1) 


(‘ 


i gam f, gm, 
wo Dw SF Dr ww 


me Ww Ww 


— Ww jb pt 


Ly, LO Ls 
— 


— 







—9(n — p) 


np(p + 2)(p + 4)(p + 6)(p — 1)’ 


(3 1 *) —3(n — p) 
1 1 -) np(p + 2)(p + 4)(p + 6)(p — 1)’ 


‘) 8M P) 
-] np(p + 2)(p + 4)(p + 6)(p — 1)’ 


E 1 .*)- _ a 2 
1 1 + +} np(p + 2)(p + 4)(p + 6)(p — 1)’ 


9(n + 4)(n + 6) 





n(n + 2)p(p + 2)(p + 4)(p + 6)’ 





pet be 4 n'(p + 3)(p + 5) +2n(p + 1)(p + 3) — 8(2p + 3)} 
4), nn +2Qpp+D>+Do+ORp—-Dpth ” 
‘2 {n*(p + 3) + 6n(p + 1) + 8(p — 3)} 
n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)’ 
3) ae + 4p + 15) + 6n(p + 1)(p — 3) + 4(5p° + 2p — 6) 
n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)(p + 1) . 
J 3{n(p + 3)(p + 5) + 2n(p + 1)(p + 3) — 8p + 3)} 
2 2/) n(n+2)p(p+2)(p+4)(p+6\(p—1(pt+1) ’ 
2 J n'(p + 3)? + 2n(p + 1)(2p + 3) + 4(p° — 4p — 6) 
2) n(n + 2)p(p + 2)(p + 4)(p + 6)(p — I(p +1) ’ 
2: ») n'(p + 3)(p + 5) + 2n(p + 1)(p + 8) — 8(2p + 3) 
2 2) n(n + 2)p(p + 2)(p + 4)(p + 6)(p — L(p +1) ’ 



































') ee et 

n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)’ 

‘ : __ =n — p)(np + 3n + 2p) 

3/ n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)(p + 1)’ 

1 2) —(n — p)(np — 3n + 8p + 12) 

1 n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)(p + 1)’ 

1 >) —3(n — p)(np + 3n + 2p) 

1 n(n + 2)p(p — 2)(p + 4)(p + 6)(p — 1)(p + 1)’ 

1 2 ») = (n = p)(np + 8n + 2p) 

1 2) n(n + 2)p(p + 2)(p + 4)(p + 6)(p — I(p + 1)’ 
oe ') 3(n — p)(n — p — 2) 

1 1 n(n + 2)p(p + 2)(p + 4)(p + 6)(p — 1)(p + 1)’ 















(n + 2)(n + 4) 2 *\ (+ 2)(np + 3n - 4) 


n* p(p + 2)(p + 4)’ . 9) Mp + 2)(p + 4)(p — 1)’ 





no wd bd 


_\ np? + 3p — 2) — 6n(p + 2) + 16 
9) ™ P(p + 2)(p + 4)(p — 1)(p — 2)’ 









i —(n—p)m+2) _ 
9 .| pp + 2)(p + 4)(p — 1)’ 


junk 


Lo) = pip + 2n — 4) 
9)” p(p + 2)(p + 4)(p — 1)(p — 2), 









1)_____ ™—p)Q@n—-p) 
1) @ P(e + 2)(p + 4)(p — 1)(p — 2)" 


REFERENCES 

[1] H. Hotelling, "Relations between two sets of variates", Biometrika, Vol. 28 (1936),
pp. 321-377.

[2] R. A. Fisher, "The sampling distribution of some statistics obtained from non-linear
equations", Ann. Eugen., Vol. 9 (1939), pp. 238-249.

[3] P. L. Hsu, "On the distribution of roots of certain determinantal equations", Ann.
Eugen., Vol. 9 (1939), pp. 250-258.

[4] S. N. Roy, "p-statistics or some generalizations in analysis of variance appropriate to
multivariate problems", Sankhyā, Vol. 4 (1939), pp. 381-396.

[5] S. N. Roy, "Analysis of variance for multivariate normal populations. The sampling
distribution of the requisite p-statistics on the null and non-null hypothesis",
Sankhyā, Vol. 6 (1942), pp. 35-50.

[6] M. S. Bartlett, "The vector representation of a sample", Proc. Camb. Phil. Soc.,
Vol. 30 (1934), pp. 327-360.

[7] R. A. Fisher, "The general sampling distribution of the multiple correlation coefficient",
Proc. Roy. Soc., Vol. A 121 (1928), pp. 654-673.

[8] E. T. Whittaker and G. N. Watson, Modern Analysis, Cambridge Univ. Press, 4th
ed., 1935.

[9] P. L. Hsu, "On the limiting distribution of the canonical correlations", Biometrika,
Vol. 32 (1941), pp. 38-45.















ON THE THEORY OF MARKOFF CHAINS 


By Elliott W. Montroll
University of Pittsburgh 


1. Summary. Although there exists a voluminous literature on the theory of
probability of independent events, and powerful techniques have been developed
for the analysis of most of the interesting problems in this field, the theory of
probability of dependent events has been rather neglected. The first detailed
investigations in this subject were published by A. Markoff [1]. S. Bernstein [2]
has extended the fundamental limit theorems to chains of dependent events.
The most extensive exposition of this field has been made by M. Fréchet [3].

In the present paper we shall develop methods of averaging functions over
chains of dependent variables and find the probability distribution of these
functions. It will be shown that for certain types of chains these averages and
distribution functions can be expressed in terms of the characteristic values and
vectors of a certain operator equation. Many of the methods discussed here
have been applied to problems in statistical mechanics [4, 5, 6, 7, 8]. The most
important application has been made by L. Onsager [8], who proved rigorously
(on the basis of a simplified model) that Boltzmann's energy distribution in a
solid with cooperative elements leads to a phase transition. The first explicit
application of linear operator theory (through matrices and integral equations) to
probability chains has apparently been made by Hostinsky [9].


2. Introductory Remarks. Suppose there exists a chain of events each of
which might lead to one of $\nu$ possible results, and which are correlated in such a
manner that the probability of n successive events leading to a chain of results

$\alpha_1, \alpha_2, \ldots, \alpha_n$

is proportional to

$P_n(\alpha_1, \alpha_2, \ldots, \alpha_n).$

The probability of a given function $F(\alpha_1, \alpha_2, \ldots, \alpha_n)$ having a value
corresponding to the sequence of $\alpha$'s would be proportional to

$F(\alpha_1, \alpha_2, \ldots, \alpha_n) P_n(\alpha_1, \ldots, \alpha_n)$

and its average value over all configurations of the chain would be

(1)  $\bar{F} = F_1/F_0 = \sum_{\{\alpha_j\}} F(\alpha_1, \alpha_2, \ldots, \alpha_n) P_n(\alpha_1, \alpha_2, \ldots, \alpha_n) \Big/ \sum_{\{\alpha_j\}} P_n(\alpha_1, \ldots, \alpha_n)$

where

(1a)  $F_m = \sum_{\{\alpha_j\}} [F(\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_n)]^m P_n(\alpha_1, \ldots, \alpha_n)$

and the summation extends over all values of

$\{\alpha_j\} = (\alpha_1, \alpha_2, \ldots, \alpha_n).$


The probability of a result $\alpha_1$ of the first event leading to a result $\alpha_n$ of the
nth event is

(2)  $P_n(\alpha_1, \alpha_n) = (1/F_0) \sum_{\alpha_2, \ldots, \alpha_{n-1}} P_n(\alpha_1, \alpha_2, \ldots, \alpha_n).$















In order to find the probability of a given function $F(\alpha_1, \ldots, \alpha_n)$ having a
value between $\xi$ and $\xi + h$ it is useful to know the moments and Thiele
semi-invariants of $F(\alpha_1, \ldots, \alpha_n)$. Both of these functions of F can be calculated
from

(3)  $Z_n(x) = \sum_{\{\alpha_j\}} P_n(\alpha_1, \ldots, \alpha_n) \exp\{x F(\alpha_1, \alpha_2, \ldots, \alpha_n)\}.$

Obviously

(4)  $F_m = \lim_{x \to 0} \partial^m Z_n(x)/\partial x^m.$

It is known [10] that the mth Thiele semi-invariant is given by

(5)  $\Lambda_m = \lim_{x \to 0} \partial^m \log Z_n(x)/\partial x^m.$
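As a hedged illustration of (1a), (3) and (5), the sketch below enumerates a very short chain, forms the raw sums $F_m$ directly, and compares the first two semi-invariants with finite-difference derivatives of $\log Z_n(x)$. The weight function `example_weight`, the function `example_F` and all helper names are assumptions made only for this example; they do not come from the paper.

```python
from itertools import product
from math import exp, log

def raw_moment(weight, F, nu, n, m):
    """F_m of (1a): sum of F**m times the chain weight over all nu**n sequences."""
    return sum(F(s) ** m * weight(s) for s in product(range(nu), repeat=n))

def Z(weight, F, nu, n, x):
    """Z_n(x) of (3), by direct enumeration."""
    return sum(weight(s) * exp(x * F(s)) for s in product(range(nu), repeat=n))

def semi_invariants_direct(weight, F, nu, n):
    """Mean and variance of F computed from F_0, F_1, F_2."""
    f0, f1, f2 = (raw_moment(weight, F, nu, n, m) for m in range(3))
    mean = f1 / f0
    return mean, f2 / f0 - mean ** 2

def semi_invariants_from_Z(weight, F, nu, n, step=1e-3):
    """First two derivatives of log Z_n(x) at x = 0, by central differences."""
    g = lambda x: log(Z(weight, F, nu, n, x))
    first = (g(step) - g(-step)) / (2 * step)
    second = (g(step) - 2 * g(0.0) + g(-step)) / step ** 2
    return first, second

def example_weight(s):
    """Hypothetical positive weight P_n built from neighbouring results."""
    w = 1.0
    for a, b in zip(s, s[1:]):
        w *= 1.0 + 0.5 * a * b + 0.25 * a
    return w

def example_F(s):
    """Hypothetical chain function: the sum of the results."""
    return sum(s)
```

The two routes agree up to the finite-difference error, which is the content of (5) for m = 1 and m = 2.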















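In the notation of Cramér, $Z_n(i\omega)/Z_n(0) = f(\omega)$, the characteristic function of F.

If $G(x)$ is defined so that $G(\xi + h) - G(\xi)$ is the probability that the function
$F(\alpha_1, \ldots, \alpha_n)$ has a value in the range $\xi < F(\alpha_1, \ldots, \alpha_n) < \xi + h$, then it is well
known [5] that if $G(x)$ is continuous at $x = \xi$ and $x = \xi + h$,

(6)  $G(\xi + h) - G(\xi) = \frac{1}{2\pi i} \lim_{T \to \infty} \int_{-T}^{T} \frac{e^{-i\omega\xi}(1 - e^{-i\omega h})}{\omega} \exp[\log f(\omega)] \, d\omega$

where

(6a)  $\log f(\omega) = \sum_{m=1}^{s} \Lambda_m (i\omega)^m/m! + o(\omega^s).$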


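When the derivative of $G(\xi)$ with respect to $\xi$ exists, the probability of
$F(\alpha_1, \ldots, \alpha_n)$ having a value between $\xi$ and $\xi + d\xi$ is

(6b)  $g(\xi) \, d\xi = (\partial G/\partial \xi) \, d\xi = \frac{d\xi}{2\pi} \lim_{T \to \infty} \int_{-T}^{T} e^{-i\omega\xi} \exp\Big\{\sum_{m} \Lambda_m (i\omega)^m/m!\Big\} \, d\omega.$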


From (5)

(7)  $\sum_{m=1}^{\infty} \Lambda_m (i\omega)^m/m! = -\log Z_n(0) + \lim_{x \to 0} e^{i\omega \, \partial/\partial x} \log Z_n(x).$

Since, for a constant c independent of x,

$e^{c \, \partial/\partial x} f(x) = f(x + c),$

we have

(8)  $\sum_{m=1}^{\infty} \Lambda_m (i\omega)^m/m! = \log \{Z_n(i\omega)/Z_n(0)\},$

and from (6)

(9)  $G(\xi + h) - G(\xi) = \frac{1}{2\pi i} \lim_{T \to \infty} \int_{-T}^{T} \frac{e^{-i\omega\xi}(1 - e^{-i\omega h}) Z_n(i\omega)}{\omega Z_n(0)} \, d\omega.$

Equations (3), (4), (5) and (9) indicate that much information concerning a
chain of correlated events can be obtained from a knowledge of $Z_n(x)$. We shall
now introduce procedures for the determination of $Z_n(x)$ for several general forms
of $P_n(\alpha_1, \ldots, \alpha_n)$.

When $\alpha$ is a continuous variable, the results of this section and those to follow
are easily generalized by replacing the summation operations over all values
of the $\alpha$'s by integrals, and by replacing the matrix equations of the next section
by integral equations.


3. Simple Chains, $P_n(\alpha_1, \ldots, \alpha_n) = \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1})$.


a. General theory. By a simple chain we shall mean a sequence of events,
each of which leads to one of $\nu$ possible results and which occur in such a manner
that if the result of the kth event is $\alpha_k$, the probability of the (k + 1)st one
yielding a result $\alpha_{k+1}$ is proportional to $p(\alpha_k, \alpha_{k+1})$. This implies that the
probability of the occurrence of the sequence of results

$\alpha_1, \alpha_2, \ldots, \alpha_n$

is

(10)  $\prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}) \Big/ \sum_{\{\alpha_j\}} \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}),$

and the probability of a first result $\alpha_1$ leading to an nth result $\alpha_n$ is

(11)  $P_n(\alpha_1, \alpha_n) = \sum_{\alpha_2, \ldots, \alpha_{n-1}} \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}) \Big/ \sum_{\{\alpha_j\}} \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}).$

The summations are to be extended over all $\nu$ values of each $\alpha_j$ indicated
on the summation indices. Chains of this type are sometimes called simple
Markoff chains after the first author who studied them systematically.
From (1), the average value of a function $F(\alpha_1, \ldots, \alpha_n)$ is

(12)  $F_1/F_0 = \sum_{\{\alpha_j\}} F(\alpha_1, \ldots, \alpha_n) \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}) \Big/ \sum_{\{\alpha_j\}} \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}).$








Many chain functions $F(\alpha_1, \ldots, \alpha_n)$ of interest are either additive or
multiplicative and of one of the forms

(13a)  a) $F_1(\alpha_1, \ldots, \alpha_n) = h(\alpha_1, \alpha_2) + h(\alpha_2, \alpha_3) + \cdots + h(\alpha_{n-1}, \alpha_n)$
(13b)  b) $F_2(\alpha_1, \ldots, \alpha_n) = g(\alpha_1, \alpha_2) \, g(\alpha_2, \alpha_3) \cdots g(\alpha_{n-1}, \alpha_n).$

In case (b) it is convenient to define a new function $h(\alpha_i, \alpha_j)$ by

(14)  $g(\alpha_i, \alpha_j) = \exp[h(\alpha_i, \alpha_j)]$

and in both cases to consider a function of the form

(15)  $Z_n(x) = \sum_{\{\alpha_j\}} \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1}) \exp[x h(\alpha_j, \alpha_{j+1})],$

for then the values of $F_1$ and $F_2$ averaged over the entire chain are given by

(16a)  $\langle F_1 \rangle_{av.} = \lim_{x \to 0} \partial \log Z_n(x)/\partial x$

and

(16b)  $\langle F_2 \rangle_{av.} = Z_n(1)/Z_n(0).$


When n is large, the direct evaluation of (15) may become quite difficult
because of the large number of variables involved. As an alternative we shall
now introduce a procedure that is based on the observation that $Z_n(x)$ is the
sum of the elements of the (n - 1)th power of the matrix

(17)  $P_x = \begin{pmatrix} p_x(1, 1) & p_x(1, 2) & \cdots & p_x(1, \nu) \\ p_x(2, 1) & p_x(2, 2) & \cdots & p_x(2, \nu) \\ \vdots & \vdots & & \vdots \\ p_x(\nu, 1) & p_x(\nu, 2) & \cdots & p_x(\nu, \nu) \end{pmatrix}$

where the elements $p_x(\alpha, \beta)$ are defined as

(18)  $p_x(\alpha, \beta) = p(\alpha, \beta) \exp[x h(\alpha, \beta)].$
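The observation that $Z_n(x)$ of (15) is a sum of matrix elements can be checked numerically on a small chain. The following sketch compares direct enumeration over all sequences with the sum of the elements of the (n - 1)th power of the matrix with elements (18). The helper names and the numerical values used to exercise them are illustrative assumptions.

```python
from itertools import product
from math import exp

def z_brute(p, h, n, x):
    """Z_n(x) of (15): enumerate every sequence of n results."""
    nu = len(p)
    total = 0.0
    for seq in product(range(nu), repeat=n):
        w = 1.0
        for a, b in zip(seq, seq[1:]):
            w *= p[a][b] * exp(x * h[a][b])
        total += w
    return total

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def z_matrix(p, h, n, x):
    """Z_n(x) as the sum of the elements of the (n - 1)th power of P_x."""
    nu = len(p)
    Px = [[p[a][b] * exp(x * h[a][b]) for b in range(nu)] for a in range(nu)]
    M = [[1.0 if i == j else 0.0 for j in range(nu)] for i in range(nu)]
    for _ in range(n - 1):
        M = matmul(M, Px)
    return sum(sum(row) for row in M)
```

Enumeration costs $\nu^n$ terms, whereas the matrix route needs only n - 1 multiplications of $\nu \times \nu$ matrices, which is the practical point of the reformulation.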


$\alpha$ and $\beta$ range over the same set of values as one of the "result" parameters
$\alpha_j$; and each of the $\nu$ possible results is represented by a unique integer of the set
$1, 2, \ldots, \nu$. Thus $Z_n(x)$ = sum of elements of $P_x^{n-1}$. To employ this
observation to advantage, let us consider the characteristic values and vectors of the
matrix $P_x$. It is well known that if the characteristic values are simple the
characteristic vectors form a biorthogonal set; that is, if

(19a)  $\Phi_{i,x} = \{\varphi_{i,x}(1), \varphi_{i,x}(2), \ldots, \varphi_{i,x}(\nu)\}, \quad (i = 1, 2, \ldots, \nu),$

and

(19b)  $\Psi_{i,x} = \begin{pmatrix} \psi_{i,x}(1) \\ \psi_{i,x}(2) \\ \vdots \\ \psi_{i,x}(\nu) \end{pmatrix}$




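satisfy the operator equations

(20a)  $\Phi_{i,x} \cdot P_x = \lambda_{i,x} \Phi_{i,x}$
(20b)  $P_x \cdot \Psi_{i,x} = \lambda_{i,x} \Psi_{i,x}$

where $\lambda_{i,x}$ is the ith characteristic value of (17), then

$\Phi_{i,x} \cdot \Psi_{j,x} = \sum_{\alpha=1}^{\nu} \varphi_{i,x}(\alpha) \psi_{j,x}(\alpha) = 0$ when $i \neq j$.

We shall for convenience always assume that the $\varphi$'s and $\psi$'s are normalized:

$\Phi_{i,x} \cdot \Psi_{i,x} = 1$

so that in general:

(21)  $\Phi_{i,x} \cdot \Psi_{j,x} = \delta_{ij} = \begin{cases} 0 & \text{when } i \neq j \\ 1 & \text{when } i = j. \end{cases}$

It is well known from matrix theory that one can expand a matrix element as

(22)  $p_x(\alpha, \beta) = \sum_{i} \lambda_{i,x} \varphi_{i,x}(\beta) \psi_{i,x}(\alpha)$

and that

(23)  $\lambda_{i,x} = \Phi_{i,x} \cdot P_x \cdot \Psi_{i,x}.$

By substituting (22) into the expression for $Z_n(x)$ in terms of $P_x^{n-1}$, one can
show that

(24)  $Z_n(x) = \sum_{i} \lambda_{i,x}^{n-1} \sum_{\alpha_n} \varphi_{i,x}(\alpha_n) \sum_{\alpha_1} \psi_{i,x}(\alpha_1) = \sum_{i} \lambda_{i,x}^{n-1} (\Phi_{i,x} \cdot 1)(1 \cdot \Psi_{i,x}),$

where 1 denotes the vector whose $\nu$ components are all unity.
Therefore $Z_n(x)$ can be determined from a knowledge of the characteristic vectors
and values of the matrix $P_x$.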
If there exists a largest characteristic root $\lambda_{L,x}$ such that

(25)  $\lambda_{L,x} > |\lambda_{i,x}|$ if $i \neq L$,

one can obtain some interesting results. Before deriving these, we shall give a
sufficient condition (which is satisfied in many chains) for the existence of this
inequality. Frobenius [11] has shown that if all the elements of a finite matrix
are > 0, then the characteristic value of largest absolute value of the matrix is
real, positive, and simple (nondegenerate). Thus, as long as $\nu$ is finite and
$p_x(\alpha, \beta) > 0$ for all $\alpha$ and $\beta$, (25) is valid.

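We shall now prove that

(25a)  $\lim_{n \to \infty} \frac{Z_n(x)}{\lambda_{L,x}^{n-1} (\Phi_{L,x} \cdot 1)(1 \cdot \Psi_{L,x})} = 1;$

that is,

(25b)  $Z_n(x) \sim \lambda_{L,x}^{n-1} (\Phi_{L,x} \cdot 1)(1 \cdot \Psi_{L,x}).$

First let us consider the case in which $P_x$ is a symmetrical matrix. Then
$\varphi_{j,x}(\alpha) = \psi_{j,x}(\alpha)$, all the characteristic values are real, and

$Z_n(x) = \lambda_{L,x}^{n-1} (\Phi_{L,x} \cdot 1)^2 + \sum_{i \neq L} \lambda_{i,x}^{n-1} (\Phi_{i,x} \cdot 1)^2.$

From Cauchy's inequality and (21)

$|\Phi_{i,x} \cdot 1|^2 = \Big|\sum_{\alpha} \varphi_{i,x}(\alpha)\Big|^2 \leq \Big[\sum_{\alpha} |\varphi_{i,x}(\alpha)|^2\Big] \Big[\sum_{\alpha} 1\Big] = \nu.$

Therefore,

$\Big|\sum_{i \neq L} \lambda_{i,x}^{n-1} (\Phi_{i,x} \cdot 1)^2\Big| \leq \nu(\nu - 1) |\lambda_{s,x}|^{n-1}$

where $\lambda_{s,x}$ is the characteristic value of $P_x$ second largest in absolute value.
This inequality yields

(25c)  $\Big|\frac{Z_n(x)}{\lambda_{L,x}^{n-1} (\Phi_{L,x} \cdot 1)^2} - 1\Big| \leq \frac{\nu(\nu - 1)}{(\Phi_{L,x} \cdot 1)^2} \Big|\frac{\lambda_{s,x}}{\lambda_{L,x}}\Big|^{n-1}$

and (25a) (since $|\lambda_{s,x}/\lambda_{L,x}| < 1$) follows. When $P_x$ is not symmetrical, one can
easily derive the analogous expression

$\Big|\frac{Z_n(x)}{\lambda_{L,x}^{n-1} (\Phi_{L,x} \cdot 1)(1 \cdot \Psi_{L,x})} - 1\Big| \leq \frac{A(\nu - 1) |\lambda_{s,x}/\lambda_{L,x}|^{n-1}}{|(\Phi_{L,x} \cdot 1)(1 \cdot \Psi_{L,x})|}$

where

$A = [\max_i \{|(\Phi_{i,x} \cdot 1)|\}] [\max_i \{|(1 \cdot \Psi_{i,x})|\}].$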
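A small numerical sketch of (25a) may help. It computes the dominant characteristic value and vectors of a positive matrix by power iteration (a technique substituted here for convenience, not one used in the paper) and compares the exact sum of elements of the (n - 1)th power with the asymptotic form (25b). The function names and the test matrix are assumptions.

```python
def dominant(M, iters=500):
    """Largest characteristic value and a right vector, by power iteration."""
    size = len(M)
    v = [1.0 / size] * size
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = sum(w)  # v sums to 1, so the sum of M v estimates the root
        v = [wi / lam for wi in w]
    return lam, v

def z_exact(M, n):
    """Sum of the elements of the (n - 1)th power of M."""
    u = [1.0] * len(M)
    for _ in range(n - 1):
        u = [sum(M[i][j] * u[j] for j in range(len(M))) for i in range(len(M))]
    return sum(u)

def z_asymptotic(M, n):
    """lambda_L**(n-1) (Phi_L . 1)(1 . Psi_L), with Phi_L . Psi_L scaled to 1."""
    lam, psi = dominant(M)
    transpose = [list(col) for col in zip(*M)]
    _, phi = dominant(transpose)  # left vector of M
    overlap = sum(a * b for a, b in zip(phi, psi))
    return lam ** (n - 1) * sum(phi) * sum(psi) / overlap

def ratio(M, n):
    """Ratio whose limit is asserted to be 1 in (25a)."""
    return z_exact(M, n) / z_asymptotic(M, n)
```

For a matrix with all elements positive the ratio approaches unity geometrically, at the rate set by the second-largest characteristic value.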


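For brevity, when x = 0, we write $\lambda_{i,x}$ as $\lambda_i$, $\Psi_{i,x}$ as $\Psi_i$ and $\Phi_{i,x}$ as $\Phi_i$. By
summing (10) over all $\alpha$'s except $\alpha_1$, $\alpha_k$ and $\alpha_n$ we obtain the probability of an
intermediate event leading to a result $\alpha_k$ if the results of the first and last events
are known to have been $\alpha_1$ and $\alpha_n$. With the aid of (21) and (22) it is easy to
show that this probability is exactly:

(26)  $P_n(\alpha_k) = \frac{\sum_{i} \sum_{j} \lambda_i^{k-1} \lambda_j^{n-k} \psi_i(\alpha_1) \varphi_i(\alpha_k) \psi_j(\alpha_k) \varphi_j(\alpha_n)}{\sum_{i} \lambda_i^{n-1} \psi_i(\alpha_1) \varphi_i(\alpha_n)}.$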
When n is very large, and when we have simultaneously $n \gg k \gg 1$, we can
rewrite this equation to include $\lambda_L$, and neglect all terms containing other i's and
j's. This leads to the results
a) If the number of events, n, in a simple chain is very large, the probability
$P_k(\alpha_k)$ of a kth event far removed from the first and the last, yielding a
result $\alpha_k$ when $\alpha_1$ and $\alpha_n$ are unspecified is

(27)  $P_k(\alpha_k) \sim \psi_L(\alpha_k) \varphi_L(\alpha_k) / (\Phi_L \cdot \Psi_L).$





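b) When k = n, the probability of the result $\alpha_1$ of the first event leading to
the result $\alpha_n$ of the nth event is

(28a)  $P_n(\alpha_1, \alpha_n) = \frac{\sum_{i} \lambda_i^{n-1} \psi_i(\alpha_1) \varphi_i(\alpha_n)}{\sum_{\alpha_1, \alpha_n} \sum_{i} \lambda_i^{n-1} \psi_i(\alpha_1) \varphi_i(\alpha_n)}.$

So, as $n \to \infty$,

(28b)  $P_n(\alpha_1, \alpha_n) \to \frac{\psi_L(\alpha_1) \varphi_L(\alpha_n)}{(\Phi_L \cdot 1)(1 \cdot \Psi_L)}.$

c) When there exists no knowledge concerning the result of the first event, the
probability of the nth event yielding the result $\alpha_n$ is

(29)  $P_n(\alpha_n) = \sum_{\alpha_1} P_n(\alpha_1, \alpha_n) \to \varphi_L(\alpha_n)/(1 \cdot \Phi_L).$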


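In chains of sufficient length for (25) to be valid, the probability of
$F(\alpha_1, \ldots, \alpha_n)$ having a value between $\xi$ and $\xi + h$ has an especially simple
asymptotic form. From (6) this probability is (if for a given n we let $T = a n^{1/2}$)

(30)  $G(\xi + h) - G(\xi) = \frac{1}{2\pi i} \lim_{a \to \infty} \int_{-a n^{1/2}}^{a n^{1/2}} \frac{d\omega}{\omega} \, e^{-i\omega\xi} (1 - e^{-i\omega h}) \exp\Big\{\sum_{m} \Lambda_m (i\omega)^m/m!\Big\}$

and from (25) and (5)

(31)  $\Lambda_m \sim n \lim_{x \to 0} \partial^m \log \lambda_{L,x}/\partial x^m = n L_m$

if

(32)  $L_m = \lim_{x \to 0} \partial^m \log \lambda_{L,x}/\partial x^m.$

Letting $y = \omega n^{1/2}$, (30) becomes

(33)  $G(\xi + h) - G(\xi) \sim \frac{1}{2\pi i} \lim_{a \to \infty} \int_{-an}^{an} \frac{dy}{y} \, (e^{-iy\mu_1} - e^{-iy\mu_2}) \, e^{-y^2 L_2/2} \Big\{1 - \frac{i L_3 y^3}{6 n^{1/2}} + \cdots\Big\}$

where

(34a)  $\mu_1 = (\xi - \Lambda_1)/n^{1/2}, \quad \mu_2 = (\xi + h - \Lambda_1)/n^{1/2},$
(34b)  $\Lambda_1$ = average value of $F(\alpha_1, \ldots, \alpha_n) = \bar{F}.$

Integrating (33)

(35)  $G(\xi + h) - G(\xi) \sim \frac{1}{(2\pi L_2)^{1/2}} \int_{\mu_1}^{\mu_2} e^{-y^2/2L_2} [1 + O(n^{-1/2})] \, dy.$

As $n \to \infty$ and $h \to 0$

(35a)  $G(\xi + h) - G(\xi) \sim \frac{h}{(2\pi n L_2)^{1/2}} \exp(-\tfrac{1}{2} [\xi - \bar{F}]^2/n L_2),$

and the probability that $\xi < F < \xi + h$ becomes Gaussian.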

b. Examples of a simple chain. As an example of a simple Markoff chain let
us consider an event which can lead to either of two possible results, say "-1"
or "1". Further, let us suppose that the probability of a given result being
followed by an identical one is p and by one of another type is (1 - p); that is,

$p(1, 1) = p(-1, -1) = p, \quad p(-1, 1) = p(1, -1) = 1 - p.$

This chain would be encountered in an analysis of a sequence of tosses of a
coin with a "memory" so that the probability of two successive tosses showing
the same face of the coin would be p and that of showing opposite faces (1 - p).
A question one might ask concerning such a chain is: what is the probability
of the occurrence of a given number of transitions from one kind of result to
another? In the chain of results


1, —1, ~1, 1,1, «1,1, <1, -1, -1 


there would be four transitions, one corresponding to each -1 followed by a 1
and to each 1 followed by a -1. The function giving the number of transitions
in a sequence of n events is

(36)  $F(\alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{n-1} h(\alpha_j, \alpha_{j+1})$

where

$h(-1, -1) = h(1, 1) = 0,$
$h(-1, 1) = h(1, -1) = 1.$

Even though the $\alpha$'s are dependent, in this special case, $h(\alpha_j, \alpha_{j+1})$ and
$h(\alpha_{j+1}, \alpha_{j+2})$ are independent so that (40) could have been obtained on this basis.

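To apply the methods described in the beginning of this section we must find
the characteristic values and vectors of the matrix

(37)  $P_x = \begin{pmatrix} p & (1 - p)e^x \\ (1 - p)e^x & p \end{pmatrix}$

(the configuration index $\alpha$ has the value either -1 or 1 in this case instead of
"1" and "2" as given in (17)). The characteristic values are the roots of the
equation

$\begin{vmatrix} p - \lambda & (1 - p)e^x \\ (1 - p)e^x & p - \lambda \end{vmatrix} = 0;$

that is,

(38)  $\lambda_{1,x} = p + (1 - p)e^x = \lambda_{L,x},$
      $|\lambda_{2,x}| = |p - (1 - p)e^x| < \lambda_{1,x},$

and the characteristic vectors are

$\Psi_{1,x} = 2^{-1/2} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\Psi_{2,x} = 2^{-1/2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$

The $\psi$ and $\varphi$ vectors have the same components in this case because of the
symmetry of the $P_x$ matrix. Clearly

$\lambda_L = \lambda_1 = \lambda_{1,0} = 1; \quad \lambda_2 = \lambda_{2,0} = 2p - 1,$
$\psi_1(\alpha) = 2^{-1/2}$ and $\psi_2(\alpha) = -\alpha \cdot 2^{-1/2}.$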


From (26) we see that if the result of the first event in the chain is $\alpha_1$, and
that of the nth event is $\alpha_n$, the probability of the kth event yielding the result
$\alpha_k$ is

$P_n(\alpha_k) = \frac{[(2p - 1)^{k-1} \alpha_1 \alpha_k + 1][1 + (2p - 1)^{n-k} \alpha_k \alpha_n]}{2[1 + (2p - 1)^{n-1} \alpha_1 \alpha_n]}.$

As k, n, and (n - k) simultaneously get very large, $P_n(\alpha_k) \sim \tfrac{1}{2}$, independently
of $\alpha_k$.
The probability of an initial result $\alpha_1$ leading to a final result $\alpha_n$ is (from (28a))

$P_n(\alpha_1, \alpha_n) = (\tfrac{1}{4}) \{1 + (2p - 1)^{n-1} \alpha_1 \alpha_n\}$

so that

$P_n(1, 1) = P_n(-1, -1) = (\tfrac{1}{4}) \{1 + (2p - 1)^{n-1}\},$
$P_n(-1, 1) = P_n(1, -1) = (\tfrac{1}{4}) \{1 - (2p - 1)^{n-1}\}.$
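The closed form for $P_n(\alpha_1, \alpha_n)$ can be compared with a direct evaluation of (28a), in which the numerator is an element of the (n - 1)th power of the matrix (37) at x = 0 and the denominator is the sum of all its elements. The sketch below does this; the function names are illustrative assumptions.

```python
def first_last_joint(p, n):
    """P_n(alpha_1, alpha_n) from the (n - 1)th power of the matrix (37) at x = 0."""
    P = [[p, 1 - p], [1 - p, p]]
    M = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n - 1):
        M = [[sum(M[i][k] * P[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    total = sum(map(sum, M))
    results = (-1, 1)  # row and column 0 stand for "-1", 1 for "+1"
    return {(results[i], results[j]): M[i][j] / total
            for i in range(2) for j in range(2)}

def closed_form(p, n, a1, an):
    """(1/4){1 + (2p - 1)**(n - 1) * alpha_1 * alpha_n}."""
    return 0.25 * (1 + (2 * p - 1) ** (n - 1) * a1 * an)
```

For p > 1/2 the correlation between the first and last results decays as $(2p - 1)^{n-1}$, so all four joint probabilities tend to 1/4.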


Now, to answer our original question regarding the probability distribution
of the transition function (36)

(39)  $F(\alpha_1, \ldots, \alpha_n) = \sum_{j=1}^{n-1} h(\alpha_j, \alpha_{j+1}),$

we use the expression for $Z_n(x)$ determined from (24)

$Z_n(x) = 2\{p + (1 - p)e^x\}^{n-1}.$





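From (9) the probability of there being between $\xi$ and $\xi + h$ transitions in a
sequence of n + 1 events is

(40)  $G(\xi + h) - G(\xi) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} e^{-i\omega\xi} (1 - e^{-i\omega h}) \{p + (1 - p)e^{i\omega}\}^n \, d\omega/\omega$
      $= \frac{1}{2\pi i} \int_{-\infty}^{\infty} (e^{-i\omega\xi} - e^{-i\omega(\xi + h)}) \sum_{k=0}^{n} \frac{n! (1 - p)^k p^{n-k}}{(n - k)! \, k!} e^{i\omega k} \, d\omega/\omega.$

Letting $x = \omega h/2$ and rearranging

$G(\xi + h) - G(\xi) = \sum_{k=0}^{n} \frac{n! (1 - p)^k p^{n-k}}{(n - k)! \, k!} D\Big(\frac{2}{h}\big(k - \xi - \tfrac{1}{2}h\big)\Big),$

where $D(\lambda)$ is the Dirichlet integral

$D(\lambda) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin x}{x} e^{i\lambda x} \, dx = \begin{cases} 0 & \text{if } |\lambda| > 1 \\ \tfrac{1}{2} & \text{if } |\lambda| = 1 \\ 1 & \text{if } |\lambda| < 1. \end{cases}$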


We therefore have, when $[\xi + h] < n$

(41)  $G(\xi + h) - G(\xi) = \sum_{k=[\xi + 1]}^{[\xi + h]} \frac{n! (1 - p)^k p^{n-k}}{(n - k)! \, k!}.$

Here [x] denotes the greatest integer not exceeding x. The sum is zero if
$[\xi + h] < [\xi + 1]$. When $[\xi + h] > n$

(42)  $G(\xi + h) - G(\xi) = \sum_{k=[\xi + 1]}^{n} \frac{n! (1 - p)^k p^{n-k}}{k! \, (n - k)!}.$
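Formulas (41) and (42) say that the number of transitions in a chain of n + 1 events follows a binomial law with parameter 1 - p. As a hedged check, the sketch below enumerates all $2^{n+1}$ sequences of a short chain (taking the two initial results as equally likely, an assumption that does not affect the count of transitions) and compares the result with the binomial terms. Function names are illustrative.

```python
from itertools import product
from math import comb, floor

def transitions_by_enumeration(p, n):
    """Distribution of the number of transitions among n + 1 events."""
    dist = [0.0] * (n + 1)
    for seq in product((-1, 1), repeat=n + 1):
        w = 0.5  # first result taken as equally likely to be -1 or 1
        k = 0
        for a, b in zip(seq, seq[1:]):
            if a == b:
                w *= p
            else:
                w *= 1 - p
                k += 1
        dist[k] += w
    return dist

def binomial_term(p, n, k):
    """n! (1 - p)**k p**(n - k) / ((n - k)! k!)."""
    return comb(n, k) * (1 - p) ** k * p ** (n - k)

def prob_between(p, n, xi, h):
    """G(xi + h) - G(xi) as in (41) and (42)."""
    lo = max(floor(xi + 1), 0)
    hi = min(floor(xi + h), n)
    return sum(binomial_term(p, n, k) for k in range(lo, hi + 1))
```

With `xi = 1.5` and `h = 2.0`, for instance, the sum runs over k = 2 and k = 3 only.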


When n is large it is difficult to get a clear picture of the function $G(\xi)$ from
(41) and (42), so we shall develop asymptotic results for large n by using (6)
instead of (9).

By employing (5), we see that (this section will be developed on the basis of
n + 1 trials instead of n)

$\Lambda_1 = \bar{F} = n(1 - p)$

$\Lambda_2 = np(1 - p)$

$\Lambda_3 = np(1 - p)(2p - 1)$ etc.
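These semi-invariants follow from (5) applied to $\log\{Z(x)/Z(0)\} = n \log\{p + (1 - p)e^x\}$ for n + 1 events. A minimal numerical sketch, using central finite differences in place of analytic differentiation (the step size and function names are assumptions), is:

```python
from math import exp, log

def log_z(x, p, n):
    """log of Z(x)/Z(0) for n + 1 events in the two-result chain."""
    return n * log(p + (1 - p) * exp(x))

def semi_invariants(p, n, step=1e-2):
    """First three derivatives of log Z at x = 0, by central differences."""
    f = lambda x: log_z(x, p, n)
    d1 = (f(step) - f(-step)) / (2 * step)
    d2 = (f(step) - 2 * f(0.0) + f(-step)) / step ** 2
    d3 = (f(2 * step) - 2 * f(step) + 2 * f(-step) - f(-2 * step)) / (2 * step ** 3)
    return d1, d2, d3
```

For p = 0.7 and n = 10 the three values come out close to n(1 - p) = 3, np(1 - p) = 2.1 and np(1 - p)(2p - 1) = 0.84.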
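Therefore, from (6)

$G(\xi + h) - G(\xi) \sim \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{e^{-i\omega(\xi - \Lambda_1)} - e^{-i\omega(\xi + h - \Lambda_1)}}{\omega} \exp[-\tfrac{1}{2} np(1 - p)\omega^2 - i np(1 - p)(2p - 1)\omega^3/6 - \cdots] \, d\omega.$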








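Letting $u = \omega n^{1/2}$, we have

$\Delta G = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{du}{u} \big[e^{-iu(\xi - \Lambda_1)/n^{1/2}} - e^{-iu(\xi + h - \Lambda_1)/n^{1/2}}\big] \Big\{1 - \frac{i p(1 - p)(2p - 1)u^3}{6 n^{1/2}} + O\Big(\frac{1}{n}\Big)\Big\} e^{-\frac{1}{2} u^2 p(1 - p)}$
$= \frac{1}{2\pi} \int_{\mu_1}^{\mu_2} d\mu \int_{-\infty}^{\infty} \Big\{1 - \frac{i p(1 - p)(2p - 1)u^3}{6 n^{1/2}} + O\Big(\frac{1}{n}\Big)\Big\} e^{-iu\mu} e^{-\frac{1}{2} u^2 p(1 - p)} \, du$

where

$\mu_2 = (\xi + h - \Lambda_1)/n^{1/2}$
$\mu_1 = (\xi - \Lambda_1)/n^{1/2}.$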






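Since

$\int_{-\infty}^{\infty} u^3 e^{-iu\mu - au^2} \, du = -\frac{3 i \pi^{1/2} \mu}{4 a^{5/2}} \, e^{-\mu^2/4a} \Big(1 - \frac{\mu^2}{6a}\Big),$

we have for large n

$\Delta G = \frac{1}{[2\pi p(1 - p)]^{1/2}} \int_{\mu_1}^{\mu_2} e^{-\mu^2/2p(1 - p)} \Big\{1 - \frac{(2p - 1)\mu}{2 n^{1/2} p(1 - p)} \Big[1 - \frac{\mu^2}{3p(1 - p)}\Big] + O\Big(\frac{1}{n}\Big)\Big\} \, d\mu.$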


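As $n \to \infty$ and $h \to 0$, this becomes

$G(\xi + h) - G(\xi) \sim \frac{h \exp\{-(\xi - \bar{F})^2/2np(1 - p)\}}{[2\pi np(1 - p)]^{1/2}} \Big\{1 - \frac{(2p - 1)(\xi - \bar{F})}{2np(1 - p)} \Big[1 - \frac{(\xi - \bar{F})^2}{3np(1 - p)}\Big] + O\Big(\frac{1}{n}\Big)\Big\}.$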


A similar problem which occurs in statistics of high polymers can be stated
abstractly as follows. Suppose there exists a sequence of events each of which
leads to a translation of length a of a point either to the right or to the left, and
that the probability of a translation continuing in the same direction as its
predecessor is p while that of changing its direction is (1 - p). After n
translations what is the probability of a point being displaced a distance $\xi$ from its
origin?

If "-1" represents a translation to the left and "+1" a translation to the
right,

(43b)  $p(1, 1) = p(-1, -1) = p,$
       $p(1, -1) = p(-1, 1) = (1 - p).$















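The function giving the distance of the point from its origin after n displacements
is (when $\alpha = \pm 1$)

$F(\alpha_1, \ldots, \alpha_n) = a \sum_{j=1}^{n} \alpha_j = \tfrac{1}{2} a\alpha_1 + h(\alpha_1, \alpha_2) + \cdots + h(\alpha_{n-1}, \alpha_n) + \tfrac{1}{2} a\alpha_n,$

where

$h(1, 1) = a, \quad h(-1, -1) = -a,$
$h(1, -1) = h(-1, 1) = 0.$

Neglecting the terms $a\alpha_1/2$ and $a\alpha_n/2$ in $F(\alpha_1, \ldots, \alpha_n)$, one can answer questions
concerning this problem by evaluating $Z_n(x)$ as defined by (15). In this case
$P_x$ has the form

$P_x = \begin{pmatrix} p e^{ax} & 1 - p \\ 1 - p & p e^{-ax} \end{pmatrix}.$

Its characteristic roots are

$\lambda_{1,x} = p \cosh ax + [p^2 \cosh^2 ax + (1 - 2p)]^{1/2} = \lambda_{L,x},$
$|\lambda_{2,x}| = |p \cosh ax - [p^2 \cosh^2 ax + (1 - 2p)]^{1/2}| < \lambda_{1,x},$

and its characteristic vectors:

$\Psi_{1,x} = [(p - 1)^2 + (p e^{ax} - \lambda_{1,x})^2]^{-1/2} \begin{pmatrix} p - 1 \\ p e^{ax} - \lambda_{1,x} \end{pmatrix},$

$\Psi_{2,x} = [(p - 1)^2 + (p e^{ax} - \lambda_{2,x})^2]^{-1/2} \begin{pmatrix} p - 1 \\ p e^{ax} - \lambda_{2,x} \end{pmatrix}.$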


Since

$\bar{F} = \Lambda_1 = \lim_{x \to 0} \partial \log Z_n(x)/\partial x,$

one can show in the present problem that $\bar{F} = 0$. Therefore, the probability
of the translated point being a distance between $\xi$ and $\xi + h$ from the origin
after (n + 1) translations is, as $n \to \infty$ and $h \to 0$,

$G(\xi + h) - G(\xi) \sim h (2\pi n L_2)^{-1/2} e^{-\xi^2/2nL_2},$

where $L_2$ is by (32):

$L_2 = \lim_{x \to 0} \partial^2 \log \lambda_{L,x}/\partial x^2 = a^2 p/(1 - p).$

Thus,

$G(\xi + h) - G(\xi) \sim h [2\pi n a^2 p/(1 - p)]^{-1/2} \exp\{-(1 - p)\xi^2/2na^2 p\}.$

When p = 2/3 this problem is equivalent to the determination of the
probability distribution of the components in an arbitrary direction of the distance
between the ends of a linear polymer. In this case

$G(\xi + h) - G(\xi) \sim h (4\pi a^2 n)^{-1/2} \exp(-\xi^2/4na^2),$

a result obtained by Tobolsky [12] after a lengthy and complicated combinatory
calculation.
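The value $L_2 = a^2 p/(1 - p)$ can be checked against an exact computation for a moderately long chain. The sketch below takes a = 1 and a first step equally likely in either direction (both assumptions), propagates the joint distribution of position and last step, and returns the exact variance of the displacement; dividing by n should approach p/(1 - p). The function name is illustrative.

```python
def displacement_variance(p, n):
    """Exact variance of the displacement after n unit steps with persistence p."""
    # state: (position, direction of last step) -> probability
    dist = {(1, 1): 0.5, (-1, -1): 0.5}
    for _ in range(n - 1):
        new = {}
        for (pos, last), w in dist.items():
            for step in (-1, 1):
                q = p if step == last else 1 - p
                key = (pos + step, step)
                new[key] = new.get(key, 0.0) + w * q
        dist = new
    mean = sum(pos * w for (pos, _), w in dist.items())
    return sum((pos - mean) ** 2 * w for (pos, _), w in dist.items())
```

For p = 2/3 the ratio of variance to n tends to 2, which is the factor appearing in the exponent $\xi^2/4na^2$ of the polymer result; for p = 1/2 the steps are independent and the variance is exactly n.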

Another type of simple chain is encountered in the determination of the
"life span" of a particle which is displaced a unit distance to the right or left
per unit time along a straight line until it collides with an absorbing boundary
either -(q + 1) or (p + 1) units from the starting point. This problem has
been analyzed by M. Kac using the methods discussed in the present paper.
We shall generalize his results to include the effect of an attraction of the particle
toward one end of the line so that displacements toward that end are more
probable than those in the other direction.

Following the notation of Kac [13] we let $X_j$ represent the jth displacement,
$m_j$ its length, and $\delta(m)$ the probability of a given displacement having the
length m. Then,

$\delta(m) = \begin{cases} s & \text{if } m = 1 \\ 1 - s & \text{if } m = -1 \\ 0 & \text{otherwise.} \end{cases}$

If N represents the life span of a particle, the probability of its exceeding n is

$\text{Prob}\{N > n\} = \text{Prob}\{-q \leq X_1 \leq p, \; -q \leq X_1 + X_2 \leq p, \; \ldots, \; -q \leq X_1 + X_2 + \cdots + X_n \leq p\} = \sum \delta(m_1)\delta(m_2) \cdots \delta(m_n)$

where the summation extends over all integers $m_1, m_2, \ldots, m_n$ such that
$-q \leq m_1 \leq p, \; -q \leq m_1 + m_2 \leq p, \; \ldots, \; -q \leq m_1 + m_2 + \cdots + m_n \leq p$.

Defining the new set of variables

$\alpha_j = q + m_1 + m_2 + \cdots + m_j \quad (j = 1, \ldots, n)$

we see that

$\text{Prob}\{N > n\} = \sum_{\alpha_1, \ldots, \alpha_n = 0}^{p+q} \delta(\alpha_1 - q)\delta(\alpha_2 - \alpha_1) \cdots \delta(\alpha_n - \alpha_{n-1}).$

As before, if we introduce the P matrix (of p + q + 1 rows and columns)

$P = (\delta(\alpha - \beta)),$





we obtain after applying the equivalent of (22)

$\text{Prob}\{N > n\} = \sum_{j=1}^{p+q+1} \lambda_j^n \varphi_j(q) \sum_{\alpha_n=0}^{p+q} \psi_j(\alpha_n),$

where $\lambda_j$ is the jth characteristic value of P, and $\psi_j$ and $\varphi_j$ are its associated
characteristic vectors as defined by (19) and (20) (here the range of $\alpha$ starts
from 0 instead of 1 as in (17) and (19)).
It is easy to show that the characteristic values of P are

$\lambda_j = 2[s(1 - s)]^{1/2} \cos \theta_j \quad (j = 1, 2, \ldots, p + q + 1)$

where

$\theta_j = \pi j/(p + q + 2)$

and that the components of the characteristic vectors are

$\psi_j(\alpha) = [2/(p + q + 2)]^{1/2} [s/(1 - s)]^{\alpha/2} \sin (\alpha + 1)\theta_j \quad (\alpha = 0, 1, \ldots, p + q)$

and

$\varphi_j(\alpha) = [2/(p + q + 2)]^{1/2} [(1 - s)/s]^{\alpha/2} \sin (\alpha + 1)\theta_j.$

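Since

$\sum_{\alpha_n=0}^{p+q} \psi_j(\alpha_n) = \frac{\sqrt{2} \, (1 - s)\{1 - (-1)^j [s/(1 - s)]^{(p+q+2)/2}\} \sin \theta_j}{\sqrt{p + q + 2} \, \{1 - 2[s(1 - s)]^{1/2} \cos \theta_j\}}$

we finally have

$\text{Prob}\{N > n\} = \frac{2^{n+1} (1 - s)^{(n+q+2)/2} s^{(n-q)/2}}{p + q + 2} \sum_{j=1}^{p+q+1} \frac{\cos^n \theta_j \, \sin (q + 1)\theta_j \, \sin \theta_j \, \{1 - (-1)^j [s/(1 - s)]^{(p+q+2)/2}\}}{1 - 2\sqrt{s(1 - s)} \cos \theta_j}.$

When s = 1/2 this reduces to the result of Kac ($\sum^*$ means summation is only over
odd j's):

$\text{Prob}\{N > n\} = \frac{2}{p + q + 2} \sum_{j=1}^{p+q+1}{}^{*} \cos^n \theta_j \, \sin (q + 1)\theta_j \, \cot \tfrac{1}{2}\theta_j.$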
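The life-span formulas can be checked against a direct step-by-step propagation of the surviving probability. In the sketch below, `survival_dp` performs that propagation, `survival_formula` evaluates the closed form for general s as reconstructed here from the characteristic values and vectors (an assumption to be checked against the printed original), and `survival_kac` evaluates the s = 1/2 case with the sum taken over odd j. All function names are illustrative.

```python
from math import cos, sin, tan, pi, sqrt

def survival_dp(p, q, s, n):
    """Prob{N > n}: propagate the particle on the sites 0..p+q, starting at q."""
    states = p + q + 1
    dist = [0.0] * states
    dist[q] = 1.0
    for _ in range(n):
        new = [0.0] * states
        for a, w in enumerate(dist):
            if a + 1 < states:
                new[a + 1] += s * w        # step to the right, probability s
            if a - 1 >= 0:
                new[a - 1] += (1 - s) * w  # step to the left, probability 1 - s
        dist = new
    return sum(dist)

def survival_formula(p, q, s, n):
    """Closed form for general s, summed over j = 1..p+q+1."""
    m = p + q + 2
    r = sqrt(s / (1 - s))
    c = sqrt(s * (1 - s))
    total = 0.0
    for j in range(1, m):
        th = pi * j / m
        total += (cos(th) ** n * sin((q + 1) * th) * sin(th)
                  * (1 - (-1) ** j * r ** m) / (1 - 2 * c * cos(th)))
    return 2 ** (n + 1) / m * (1 - s) ** ((n + q + 2) / 2) * s ** ((n - q) / 2) * total

def survival_kac(p, q, n):
    """The s = 1/2 case: sum over odd j of cos^n(t) sin((q+1)t) cot(t/2)."""
    m = p + q + 2
    return 2 / m * sum(cos(pi * j / m) ** n * sin((q + 1) * pi * j / m)
                       / tan(pi * j / (2 * m)) for j in range(1, m, 2))
```

As a small sanity case, with p = 1, q = 0 and one step the particle survives only by moving right, so the probability is s.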

4. Simple Chains with Restrictions. Often when studying chains of dependent
events, certain functions averaged over the entire chains are known to be
restricted between definite limits. That is, there might exist k functions
$g_j(\alpha_1, \alpha_2, \ldots, \alpha_n)$ such that

(44)  $-\Delta G_j < G_j - g_j(\alpha_1, \ldots, \alpha_n) < \Delta G_j, \quad (j = 1, 2, \ldots, k),$

where the $G_j$'s and $\Delta G_j$'s are preassigned constants. To calculate averages of
other functions (1) is no longer valid, for it is an unrestricted sum over all sets
of $\alpha$'s, including those incompatible with (44). All unrestricted sums in this
formula (and other similar ones) must be replaced by sums over only those
sets of $\alpha$'s compatible with (44). Since it is sometimes more difficult to evaluate
restricted sums than unrestricted ones, we shall apply an idea of Markoff [1]
to the reduction of the former to the latter type.

Let us seek an explicit expression for a function $P_n^*(\alpha_1, \alpha_2, \ldots, \alpha_n)$ which
has the property:

$P_n^*(\alpha_1, \ldots, \alpha_n) = \begin{cases} P_n(\alpha_1, \ldots, \alpha_n) & \text{when the } \alpha\text{'s are chosen so that (44) is satisfied for all } j, \\ 0 & \text{otherwise.} \end{cases}$

Since the Dirichlet integrals

$\delta_j = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin (\rho_j \Delta G_j)}{\rho_j} \exp (i\rho_j y_j) \, d\rho_j$

have the property

$\delta_j = \begin{cases} 1 & \text{when } -\Delta G_j < y_j < \Delta G_j \\ 0 & \text{otherwise,} \end{cases}$

$P_n^*(\alpha_1, \ldots, \alpha_n) = \delta_1 \delta_2 \cdots \delta_k P_n(\alpha_1, \ldots, \alpha_n)$

has the required character provided

$y_j = G_j - g_j(\alpha_1, \ldots, \alpha_n).$

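The average value of a function $F(\alpha_1, \ldots, \alpha_n)$ can be written in terms of the
unrestricted sum

$\bar{F} = \sum_{\{\alpha_l\}} F(\alpha_1, \ldots, \alpha_n) P_n^*(\alpha_1, \ldots, \alpha_n) \Big/ \sum_{\{\alpha_l\}} P_n^*(\alpha_1, \ldots, \alpha_n),$

where the summation extends over the complete set of $\{\alpha_l\}$'s

$\{\alpha_l\} = (\alpha_1, \alpha_2, \ldots, \alpha_n).$

As in the case of chains without auxiliary restrictions, a useful function is

(45)  $Z_n^*(x) = \sum_{\{\alpha_l\}} P_n^*(\alpha_1, \ldots, \alpha_n) \exp \{x F(\alpha_1, \ldots, \alpha_n)\}$
      $= \frac{1}{\pi^k} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} S_n(x, \rho_1, \ldots, \rho_k) \prod_{m=1}^{k} \frac{\sin (\rho_m \Delta G_m)}{\rho_m} \, e^{i\rho_m G_m} \, d\rho_m$

where

$S_n(x, \rho_1, \ldots, \rho_k) = \sum_{\{\alpha_l\}} P_n(\alpha_1, \ldots, \alpha_n) \exp \Big\{x F(\alpha_1, \ldots, \alpha_n) - i \sum_{j=1}^{k} \rho_j g_j(\alpha_1, \ldots, \alpha_n)\Big\}.$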




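When $F(\alpha_1, \ldots, \alpha_n)$ and $\{g_j(\alpha_1, \ldots, \alpha_n)\}$ are all additive or multiplicative
functions of the form (13a) and (13b), say

$F(\alpha_1, \ldots, \alpha_n) = \sum_{l=1}^{n-1} h(\alpha_l, \alpha_{l+1})$

$g_j(\alpha_1, \ldots, \alpha_n) = \sum_{l=1}^{n-1} g_j(\alpha_l, \alpha_{l+1})$

and the probability chain is a simple one, $Z_n^*(x)$ reduces to a simple form.
Suppose

$P_n(\alpha_1, \ldots, \alpha_n) = \prod_{j=1}^{n-1} p(\alpha_j, \alpha_{j+1});$

then following the derivation of (24), we have

(46)  $S_n(x, \rho_1, \ldots, \rho_k) = \sum_{l=1}^{\nu} \{\lambda_{l,x,\rho}\}^{n-1} (\Phi_{l,x,\rho} \cdot 1)(1 \cdot \Psi_{l,x,\rho})$

where $\lambda_{l,x,\rho}$, $\Phi_{l,x,\rho}$ and $\Psi_{l,x,\rho}$ are characteristic values and vectors of the matrix

$\begin{pmatrix} p_{x,\rho}(1, 1) & \cdots & p_{x,\rho}(1, \nu) \\ \vdots & & \vdots \\ p_{x,\rho}(\nu, 1) & \cdots & p_{x,\rho}(\nu, \nu) \end{pmatrix}$

and

$p_{x,\rho}(\alpha, \beta) = p(\alpha, \beta) \exp \Big\{x h(\alpha, \beta) - i \sum_{j} \rho_j g_j(\alpha, \beta)\Big\}.$

Substitution of (46) into (45) allows one to calculate $Z_n^*(x)$.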


5. More Complicated Chains. In a chain of N events in which the result of
each event depends on those of its n predecessors ($n \ll N$), the calculation of
$Z_N(x)$ proceeds in essentially the same manner as in the case of a simple chain.
Let the N events be divided into N/n sets of "grand events" of n simple events
each (for simplicity we assume N is divisible by n; this restriction can easily be avoided).
Thus, if each simple event could lead to any one of $\nu$ possible results, a grand
event could lead to any one of $\nu^n$ possible results and a complicated chain becomes
a simple chain of grand events with the result of each grand event depending on
the preceding grand event. Quantitative calculations thus proceed formally in
the same manner as in a simple chain.
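The grouping into grand events can be illustrated for the smallest nontrivial case, n = 2 predecessors. In the sketch below (all names and weights are illustrative assumptions), `w[a][b][c]` is the weight of result c following results a, b, and `init[a][b]` is the weight of the first two results. The total weight of all sequences is computed once by enumeration and once as a simple chain whose $\nu^2$ states are disjoint pairs of events.

```python
from itertools import product

def brute_force_total(nu, init, w, N):
    """Sum of chain weights over all nu**N sequences of a second-order chain."""
    total = 0.0
    for seq in product(range(nu), repeat=N):
        weight = init[seq[0]][seq[1]]
        for t in range(2, N):
            weight *= w[seq[t - 2]][seq[t - 1]][seq[t]]
        total += weight
    return total

def grand_event_total(nu, init, w, N):
    """Same sum, treating disjoint pairs of events as grand events (N even)."""
    pairs = list(product(range(nu), repeat=2))
    # weight of the grand event (c, d) following the grand event (a, b)
    G = {(a, b, c, d): w[a][b][c] * w[b][c][d]
         for (a, b) in pairs for (c, d) in pairs}
    vec = {(a, b): init[a][b] for (a, b) in pairs}
    for _ in range(N // 2 - 1):
        vec = {(c, d): sum(vec[(a, b)] * G[(a, b, c, d)] for (a, b) in pairs)
               for (c, d) in pairs}
    return sum(vec.values())
```

The dictionary `G` plays the role of the matrix (17) for the chain of grand events, so the machinery of Section 3 applies to it unchanged.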


6. Continuous Case. In this section we generalize, by studying an example,
to the case in which each event in a simple chain may lead to any one of a
continuum of results. The example is a problem arising in statistical mechanics of
molecular chains.

Consider a linear chain of n identical molecules whose centers of mass remain
at a set of fixed regularly spaced positions, but which may rotate about their
centers of mass in a plane. Suppose that the potential energy of interaction
between neighboring pairs of molecules is a function of the angles a specified
axis of the molecules makes with the line connecting the centers of mass of the
molecules; that is, the potential energy of interaction between pairs of adjacent
molecules can be written as $V(\theta_j, \theta_{j+1})$. Assuming that forces are sufficiently
short ranged that interactions between more distant neighbors can be neglected,
Boltzmann's theorem states that the probability of the axis of the first molecule
making an angle between $\theta_1$ and $\theta_1 + d\theta_1$ with the line of centers of the chain,
the second between $\theta_2$ and $\theta_2 + d\theta_2$, and the nth between $\theta_n$ and $\theta_n + d\theta_n$ is
proportional to

$\exp \Big[-\frac{1}{kT} \{V(\theta_1, \theta_2) + V(\theta_2, \theta_3) + \cdots + V(\theta_{n-1}, \theta_n)\}\Big] \, d\theta_1 \cdots d\theta_n$

where k is Boltzmann's constant and T is the absolute temperature. The
contribution of the interaction to the thermodynamic properties of the chain
can be derived from the partition function

(47)  $Z_n = \int_0^{2\pi} \int_0^{2\pi} \cdots \int_0^{2\pi} \exp \Big\{-\frac{1}{kT} [V(\theta_1, \theta_2) + \cdots + V(\theta_{n-1}, \theta_n)]\Big\} \, d\theta_1 \cdots d\theta_n.$


For example, the internal energy is

$E = \partial \log Z_n/\partial(-1/kT)$

and the specific heat is $c = dE/dT$.

It is to be noted that $Z_n$ is exactly the integral of the iterated kernel of the
integral equation

(48)  $\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp \Big\{-\frac{1}{kT} V(\theta_1, \theta_2)\Big\} \, d\theta_2.$

If $V(\theta_1, \theta_2)$ is symmetrical in $\theta_1$ and $\theta_2$, this linear homogeneous integral equation
has a set of orthonormal characteristic functions $\{\psi_j(\theta)\}$ such that

(49)  $\int_0^{2\pi} \psi_i(\theta) \psi_j(\theta) \, d\theta = \delta_{ij}.$

To each of these characteristic functions there corresponds a characteristic value
$\lambda_j$. Now it is well known that the kernel of (48) can be expanded as a series in
its characteristic functions

$\exp \Big\{-\frac{1}{kT} V(\theta_1, \theta_2)\Big\} = \sum_{j} \lambda_j \psi_j(\theta_1) \psi_j(\theta_2).$

Introducing this expression into (47) and applying the orthogonality
conditions (49), one obtains

(47a)  $Z_n = \sum_{j} \lambda_j^{n-1} \Big[\int_0^{2\pi} \psi_j(\theta) \, d\theta\Big]^2.$





Probably the most interesting example of a molecular chain of the type
described above is a chain of magnetic dipoles which are restricted to rotate only
in a plane. In that case

$V(\theta_j, \theta_{j+1}) = \frac{\mu^2}{r^3} [\cos (\theta_j - \theta_{j+1}) - 3 \cos \theta_j \cos \theta_{j+1}],$

where $\mu$ is the magnetic moment of each dipole and r is the distance between a
pair of adjacent centers of mass. This potential function leads to the integral
equation

$\lambda \psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp \Big\{-\frac{\mu^2}{r^3 kT} [\cos (\theta_1 - \theta_2) - 3 \cos \theta_1 \cos \theta_2]\Big\} \, d\theta_2.$

Since this equation is rather complicated to solve, we shall devote the rest of the
section to a potential function of less physical interest, but which leads to a less
formidable integral equation.


In studying hindered rotation of molecules, one sometimes uses potential
functions of the form

$$V(\theta_j, \theta_{j+1}) = -B\cos(\theta_j - \theta_{j+1}),$$

where $B$ is a constant. With this potential function (48) becomes

$$\lambda\psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2) \exp\{J\cos(\theta_1 - \theta_2)\}\, d\theta_2, \tag{50}$$

where

$$J = B/kT.$$


The characteristic functions and characteristic values of (50) are easily found
with the aid of the Fourier series for $\exp(J\cos\theta)$:

$$\exp(J\cos\theta) = I_0(J) + 2\sum_{m=1}^{\infty} I_m(J)\cos m\theta, \tag{51}$$

where $I_m(J)$ is the $m$th Bessel function of imaginary argument:

$$I_m(J) = \sum_{k=0}^{\infty} \frac{(J/2)^{m+2k}}{(m+k)!\,k!}.$$

From (51)

$$\exp[J\cos(\theta_1 - \theta_2)] = I_0(J) + 2\sum_{m=1}^{\infty} I_m(J)(\cos m\theta_1 \cos m\theta_2 + \sin m\theta_1 \sin m\theta_2).$$

Substituting this expression into (50) we have

$$\lambda\psi(\theta_1) = \int_0^{2\pi} \psi(\theta_2)\left\{I_0(J) + 2\sum_{m=1}^{\infty} I_m(J)(\cos m\theta_1 \cos m\theta_2 + \sin m\theta_1 \sin m\theta_2)\right\} d\theta_2.$$







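Because of the orthogonality of the trigonometric functions, one can verify by
direct substitution that the characteristic functions are

$$\psi_0(\theta) = 1/(2\pi)^{1/2},$$
$$\psi_m^{(1)}(\theta) = \pi^{-1/2}\sin m\theta; \quad \psi_m^{(2)}(\theta) = \pi^{-1/2}\cos m\theta, \quad (m = 1, 2, \cdots)$$

and the corresponding characteristic values are

$$\lambda_0 = 2\pi I_0(J),$$
$$\lambda_m^{(1)} = \lambda_m^{(2)} = 2\pi I_m(J), \quad m > 0.$$

Introducing these characteristic functions and values into (47a), and noting that
only $\psi_0$ has a non-vanishing integral over $(0, 2\pi)$, we obtain
the simple formula for the partition function:

$$Z_n = 2\pi\{2\pi I_0(J)\}^{n-1}.$$

The internal energy of the molecular chain is therefore

$$E = \partial \log Z_n/\partial(-1/kT) = -B(n-1)\, I_1(J)/I_0(J),$$

and the specific heat is:

$$C = dE/dT = \tfrac{1}{2}k(n-1)J^2\left\{1 + \frac{I_2(J)}{I_0(J)} - 2\left[\frac{I_1(J)}{I_0(J)}\right]^2\right\}.$$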
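The closed forms above can be checked numerically. The following is a minimal sketch, assuming only the series for $I_m(J)$ quoted in the text; the function names (`bessel_i`, `z2_numeric`, `z_closed`, `energy`, `energy_numeric`) are illustrative and not part of the paper. It compares $Z_2$ obtained by direct quadrature with $2\pi\{2\pi I_0(J)\}$, and the energy formula with a finite-difference derivative of $\log Z_n$.

```python
import math

def bessel_i(m, J, terms=60):
    """I_m(J) from the power series sum_k (J/2)^(m+2k) / ((m+k)! k!)."""
    return sum((J / 2.0) ** (m + 2 * k) / (math.factorial(m + k) * math.factorial(k))
               for k in range(terms))

def z2_numeric(J, grid=64):
    """Z_2: double integral of exp(J cos(t1 - t2)) over [0, 2*pi]^2.
    The rectangle rule is used; it is very accurate for periodic integrands."""
    h = 2.0 * math.pi / grid
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            total += math.exp(J * math.cos((i - j) * h))
    return total * h * h

def z_closed(n, J):
    """Closed form Z_n = 2*pi * (2*pi*I_0(J))^(n-1)."""
    return 2.0 * math.pi * (2.0 * math.pi * bessel_i(0, J)) ** (n - 1)

def energy(n, B, kT):
    """E = -B (n-1) I_1(J) / I_0(J), with J = B/kT."""
    J = B / kT
    return -B * (n - 1) * bessel_i(1, J) / bessel_i(0, J)

def energy_numeric(n, B, kT, eps=1e-6):
    """E = d log Z_n / d(-1/kT), by a central difference in beta = -1/kT."""
    beta = -1.0 / kT
    log_z = lambda b: math.log(z_closed(n, -B * b))  # J = -B * beta
    return (log_z(beta + eps) - log_z(beta - eps)) / (2.0 * eps)

if __name__ == "__main__":
    print(z2_numeric(1.5), z_closed(2, 1.5))
    print(energy(5, 2.0, 1.0), energy_numeric(5, 2.0, 1.0))
```

The two printed pairs should agree to within the quadrature and finite-difference error.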
REFERENCES 
[ 


. Marxorr, Wahrscheinlichkeitsrechnung, Leipzig, 1912. 


1] A 
[2} S. Bernstern, “Sur l’extension du théoréme limite du calcul des probabilités aux 


sommes de quantités dependantes,’”’ Math. Ann., Vol. 97 (1927), p. 1. 

[3] M. Frecutt, Recherces Theoretiques Moderns sur La Theorie des Probabilites, Vol. 2, 
Paris (1937). 

[4] H. Kramers AND G. WANNIER, ‘‘Statistics of the two-dimensional ferromagnet: Part I,” 
Phys. Rev., Vol. 60, (1941), p. 252. 

[5] E. MontROLL, ‘‘Statistical mechanics of nearest neighbor systems,’’ Jour. Chem. Phys., 
Vol. 9 (1941), p. 708; Vol. 10 (1942), p. 61. 

[6] E. Lassetrre anp J. Howe, ‘‘Thermodynamic properties of binary solid solutions on 
the basis of the nearest neighbor approximation,’’ Jour. Chem. Phys., Vol. 9 
(1941), p. 747. 

[7] J. ASHkKIN AND W. E. Lamp, ‘‘The propagation of order in crystal lattices,’’ Phys. Rev., 
Vol. 64 (1943), p. 159. 

[8] L. Onsacmr, “Crystal statistics I. A two-dimensional model with an order-disorder 
transition,’’ Phys. Rev., Vol. 65, (1944), p. 117. 

[9] M. Hostinsxy, Methodes generales du Calculu des Probabilites, Paris, 1931. 

[10] H. Cram&r, Random Variables and Probability Distributions, Cambridge Univ. Press, 
1937, Chap. 4. 

[11] G. Fropentus, ‘‘Uber Matrizen aus positiven Elementen. II.,’? Preuss. Acad. Wiss. 
Sitz., (1909), p. 514. 

[12] A. Tospotsky, PowE.t anp H. Eyring, an article in Chemistry of Large Molecules, 
Interscience Publishers, 1943, pp. 156, 182. 

[13] M. Kac, ‘‘Random walk in the presence of absorbing barriers,’”’ Annals of Math. Stat. 
Vol. 14, (1945), p. 62. 





ON THE FIRST TWO MOMENTS OF THE MEASURE OF A 
RANDOM SET 


By L. A. Santaló


Universidad Nacional del Litoral, Argentina 


1. Introduction. In a recent paper [3] H. E. Robbins derived general formulas
for the moments of the measure of any random set X, and applied the formulas 
to find the mean and the variance of a random sum of intervals on a line. In 
subsequent papers, J. Bronowski and J. Neyman [1], using other methods, found 
the variance when X is a random sum of rectangles in the plane, and H. E. 
Robbins [4] found the variance when X is a random sum of n-dimensional 
intervals in n-dimensional euclidean space. In the latter paper Robbins 
solved also the corresponding problem for circles on the plane. 

Using the methods of Robbins, our purpose in the present paper is to solve the 
following similar problems: 

(i) Let $R$ denote the rectangle consisting of all points $(x, y)$ such that $0 \le x \le A_1$,
$0 \le y \le A_2$, and let $R'$ denote the larger rectangle for which $-\delta \le x \le A_1 + \delta$,
$-\delta \le y \le A_2 + \delta$. Let $\rho$ denote a rectangle of fixed dimensions, $a \times b$, but
variable position in the plane. The position of $\rho$ will be determined by the
coordinates $x, y$ of its center $P$ and the angle $\varphi$ between the side of length $a$ and
the $x$-axis. We suppose $(a^2 + b^2)^{1/2} \le \min(A_1, A_2, \delta)$. Let a fixed number $N$ of
rectangles $\rho$ be chosen independently with the probability density function for
the coordinates $(x, y, \varphi)$ of each rectangle constant and equal to $1/(2\pi R')$ in the
three-dimensional interval with base $R'$ and height $2\pi$ and zero outside this
interval. In section 3 we evaluate the first two moments of the measure of $X$,
where $X$ denotes the intersection of the set-theoretical sum of the $N$ rectangles
$\rho$ with $R$.

(ii) Let $R$ denote the $n$-dimensional interval consisting of all points $(x_1, x_2,
x_3, \cdots, x_n)$ such that $0 \le x_i \le A_i$ $(i = 1, 2, \cdots, n)$, and let $R'$ denote the
larger interval for which $-\delta \le x_i \le A_i + \delta$. Let a fixed number $N$ of $n$-dimensional
spheres with radii $r$ (such that $2r \le \min(A_i, 2\delta)$) be chosen independently,
with the probability density function for the centre of each $n$-sphere constant
and equal to $1/R'$ in $R'$ and zero outside this interval. Denoting by $X$ the
intersection of the set-theoretical sum of the $N$ $n$-spheres with $R$, we evaluate
in section 4 the first two moments of the measure of $X$. This problem is a
generalization to $n$-dimensional space of the case considered by Robbins for the
plane ($n = 2$) in [4].


2. Preliminary formulas. Let $K$ be an indeformable plane convex figure of
variable position in the plane. The position of $K$ may be determined by the
coordinates $(x, y)$ of a point $P$ fixed within $K$ and the angle $\varphi$ which measures
the rotation of $K$ about $P$. We shall call $x, y, \varphi$ the coordinates of $K$. The
measure of a set of figures congruent with $K$ is defined as being the integral of the
differential form

$$dK = dx\,dy\,d\varphi. \tag{2.1}$$

It is readily shown that this measure does not depend on the particular point $P$
chosen to determine the position of $K$ [5]. For instance, the measure of the
set of figures $K$, each of which contains in its interior a fixed point $Q$, has the
value $2\pi F$, where $F$ denotes the area of $K$; that is,

$$\int_{Q \in K} dK = 2\pi F. \tag{2.2}$$

Let $P_1$ and $P_2$ be two fixed points and let $l$ be the distance $P_1P_2$. The measure
of the set of figures congruent with $K$, each of which contains both points $P_1$
and $P_2$ in its interior, will be a function of $K$ and $l$, say $\mu(K, l)$. If $d$ is the
diameter of $K$, that is, the maximal distance between two points of $K$, we have
$\mu(K, l) = 0$ for $l \ge d$.

Examples. Let $K$ be a rectangle $\rho$ of fixed dimensions $a \times b$, and let us
suppose $a \le b$. The diameter $d$ of $\rho$ is $d = (a^2 + b^2)^{1/2}$. Let $P(x, y)$ be the
centre of $\rho$ and $\varphi$ the angle which the side of length $b$ forms with the segment
$P_1P_2$ of length $l$. If we first keep $\varphi$ constant, then in order that there exist
positions of $\rho$ in which it contains the segment $P_1P_2$ in its interior it is necessary
that

$$a - l\sin\varphi \ge 0, \qquad b - l\cos\varphi \ge 0,$$

and in this case the area covered by the centres $P$ in all these positions has the
value

$$(a - l\sin\varphi)(b - l\cos\varphi).$$

Integrating over all permissible values of $\varphi$, we obtain

$$\mu(\rho, l) = 4\int_{\arccos[b/l]_1}^{\arcsin[a/l]_1} (a - l\sin\varphi)(b - l\cos\varphi)\, d\varphi, \tag{2.3}$$

where we define

$$[z]_1 = \begin{cases} z & \text{if } z \le 1 \\ 1 & \text{if } z > 1. \end{cases}$$

Carrying out the obvious integration in (2.3) we have

$$\mu(\rho, l) = \begin{cases}
2\pi ab - 4l(a + b) + 2l^2 & \text{for } l \le a \le b \\[4pt]
4\left(ab \arcsin(a/l) - \tfrac{1}{2}a^2 - bl + b(l^2 - a^2)^{1/2}\right) & \text{for } a \le l \le b \\[4pt]
4\left(ab\,(\arcsin(a/l) - \arccos(b/l)) + b(l^2 - a^2)^{1/2} + a(l^2 - b^2)^{1/2} - \tfrac{1}{2}(a^2 + b^2) - \tfrac{1}{2}l^2\right) & \text{for } a \le b \le l.
\end{cases} \tag{2.4}$$
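As a numerical cross-check of (2.4) against the integral (2.3), one can evaluate both for a few segment lengths. This is a sketch only; the function names `mu_rect_closed` and `mu_rect_numeric` are illustrative, and the midpoint rule is an arbitrary choice of quadrature.

```python
import math

def mu_rect_closed(a, b, l):
    """Closed form (2.4); assumes a <= b and returns 0 for l >= d."""
    d = math.hypot(a, b)
    if l >= d:
        return 0.0
    if l <= a:
        return 2 * math.pi * a * b - 4 * l * (a + b) + 2 * l * l
    if l <= b:
        return 4 * (a * b * math.asin(a / l) - 0.5 * a * a - b * l
                    + b * math.sqrt(l * l - a * a))
    return 4 * (a * b * (math.asin(a / l) - math.acos(b / l))
                + b * math.sqrt(l * l - a * a) + a * math.sqrt(l * l - b * b)
                - 0.5 * (a * a + b * b) - 0.5 * l * l)

def mu_rect_numeric(a, b, l, steps=20000):
    """Midpoint-rule evaluation of the integral (2.3)."""
    if l <= 0:
        return 2 * math.pi * a * b
    lo = math.acos(min(b / l, 1.0))   # arccos [b/l]_1
    hi = math.asin(min(a / l, 1.0))   # arcsin [a/l]_1
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        phi = lo + (i + 0.5) * h
        total += (a - l * math.sin(phi)) * (b - l * math.cos(phi))
    return 4 * total * h

if __name__ == "__main__":
    for l in (0.5, 1.5, 2.1):          # one value in each of the three ranges
        print(l, mu_rect_closed(1.0, 2.0, l), mu_rect_numeric(1.0, 2.0, l))
```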







As another example, let $R$ be the rectangle consisting of all points $(x, y)$ such
that $0 \le x \le A_1$, $0 \le y \le A_2$ and let $R'$ be the rectangle consisting of all points
$(x, y)$ such that

$$-\delta \le x \le A_1 + \delta, \quad -\delta \le y \le A_2 + \delta, \quad (a^2 + b^2)^{1/2} \le \min(A_1, A_2, \delta).$$

Let us consider the set of rectangles $\rho$ whose centers belong to $R'$ and which do not
contain either $P_1$ or $P_2$, $P_1$ and $P_2$ being two fixed points which belong to $R$.
Let $l$ be the distance $P_1P_2$. According to (2.2) and the definition of $\mu(\rho, l)$
the measure of the set of rectangles $\rho$ under consideration is

$$2\pi R' - 2 \cdot 2\pi\rho + \mu(\rho, l), \tag{2.5}$$

where $R' = (A_1 + 2\delta)(A_2 + 2\delta)$ and $\rho = ab$.

Let $K$ be a plane convex figure of fixed position in its plane. Let us suppose
$K$ to be translated a distance $l$ in the direction $\theta$, and let $F(K, l, \theta)$ be the area
of the intersection of $K$ with the translated figure. Obviously if $d$ is the diameter
of $K$, $F(K, l, \theta) = 0$ for $l \ge d$. In what follows we shall consider the function

$$\Phi(K, l) = \int_0^{2\pi} F(K, l, \theta)\, d\theta. \tag{2.6}$$


Example. Let $K$ be a rectangle $R$ of sides $A_1$, $A_2$. Let the symbol $[x]$, as
in [1], be defined by

$$[x] = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x \le 0. \end{cases}$$

It is then readily seen that

$$F(R, l, \theta) = [A_1 - l\sin\theta]\,[A_2 - l\cos\theta]. \tag{2.7}$$

For our purpose the case in which $l \le \min(A_1, A_2)$ is of interest. In this case,
carrying out the immediate integrations, we obtain

$$\Phi(R, l) = 2\pi A_1A_2 - 4l(A_1 + A_2) + 2l^2. \tag{2.8}$$


Let $S_{n,r}$ be an $n$-dimensional sphere of radius $r$. $S_{n,r}$ will denote also the
volume of this sphere, that is, as is known (see [2, p. 109]),

$$S_{n,r} = \frac{(\pi r^2)^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}. \tag{2.9}$$

Let us call the measure of a set of spheres $S_{n,r}$ the measure of the set of their
centers. That is, if the point $P(x_1, x_2, \cdots, x_n)$ is the center of $S_{n,r}$, the measure
of a set of spheres $S_{n,r}$ equals the integral, extended over the set, of the differential
form

$$dP = dx_1\,dx_2 \cdots dx_n. \tag{2.10}$$







For instance, the measure of the set of spheres $S_{n,r}$, each of which contains a
fixed point $Q$ in its interior, has the value

$$\int_{Q \in S_{n,r}} dP = S_{n,r}, \tag{2.11}$$

where $S_{n,r}$ is given by (2.9).

The measure $\mu(S_{n,r}, l)$ of the set of spheres $S_{n,r}$, each of which contains
totally in its interior a segment of length $l$ ($l \le 2r$), equals the volume of the
intersection of two spheres $S_{n,r}$ whose centers are placed at the end points of the
given segment. That is, $\mu(S_{n,r}, l)$ equals twice the volume of the spherical
segment of an $n$-sphere of radius $r$ and semiangle $\alpha = \arccos(l/2r)$. We will
represent the volume of this spherical segment by $S_{n,r}(\alpha)$, and it may be calculated
in the following way: The intersection of the $n$-sphere with a hyperplane at a
distance $x$ from the center is an $(n-1)$-dimensional sphere of radius $r' =
(r^2 - x^2)^{1/2}$. Let $S_{n-1,r'}$ denote the volume of this $(n-1)$-dimensional sphere
(given by the general formula (2.9)). The volume of the spherical segment,
whose base lies at the distance $h = r\cos\alpha$ from the center, will be

$$S_{n,r}(\alpha) = \int_h^r S_{n-1,r'}\, dx.$$

Putting $x = r\cos\theta$ and substituting for $S_{n-1,r'}$ the expression given in (2.9),
we obtain

$$S_{n,r}(\alpha) = \frac{\pi^{(n-1)/2}\, r^n}{\Gamma\left(\frac{n+1}{2}\right)} \int_0^{\alpha} \sin^n\theta\, d\theta = r S_{n-1,r} \int_0^{\alpha} \sin^n\theta\, d\theta.$$

Consequently we can write

$$\mu(S_{n,r}, l) = 2S_{n,r}(\alpha) = 2r S_{n-1,r} \int_0^{\alpha} \sin^n\theta\, d\theta, \tag{2.12}$$

where $S_{n-1,r}$ is the volume of the $(n-1)$-dimensional sphere of radius $r$ and
$\alpha = \arccos(l/2r)$.
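In (2.12) we may substitute

$$\int_0^{\alpha} \sin^n\theta\, d\theta = \frac{(n-1)(n-3)\cdots 3\cdot 1}{n(n-2)\cdots 4\cdot 2}\arccos(l/2r) - \frac{l}{2r}\left[\frac{1}{n}\left(1 - \frac{l^2}{4r^2}\right)^{(n-1)/2} + \frac{n-1}{n(n-2)}\left(1 - \frac{l^2}{4r^2}\right)^{(n-3)/2} + \cdots + \frac{(n-1)(n-3)\cdots 3\cdot 1}{n(n-2)\cdots 4\cdot 2}\left(1 - \frac{l^2}{4r^2}\right)^{1/2}\right] \tag{2.13}$$

for $n$ even, and

$$\int_0^{\alpha} \sin^n\theta\, d\theta = \frac{(n-1)(n-3)\cdots 4\cdot 2}{n(n-2)\cdots 5\cdot 3} - \frac{l}{2r}\left[\frac{1}{n}\left(1 - \frac{l^2}{4r^2}\right)^{(n-1)/2} + \frac{n-1}{n(n-2)}\left(1 - \frac{l^2}{4r^2}\right)^{(n-3)/2} + \cdots + \frac{(n-1)(n-3)\cdots 4\cdot 2}{n(n-2)\cdots 5\cdot 3}\right] \tag{2.14}$$

for $n$ odd.

In particular, for $n = 2, 3$ we have

$$\mu(S_{2,r}, l) = 4r^2\int_0^{\alpha} \sin^2\theta\, d\theta = 2r^2\arccos(l/2r) - \tfrac{1}{2}l(4r^2 - l^2)^{1/2}, \tag{2.15}$$

$$\mu(S_{3,r}, l) = 2\pi r^3\int_0^{\alpha} \sin^3\theta\, d\theta = \tfrac{4}{3}\pi r^3 - \pi r^2 l + \tfrac{1}{12}\pi l^3. \tag{2.16}$$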
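Formula (2.12) lends itself to a direct numerical check against the special cases (2.15) and (2.16). The sketch below is illustrative only (the names `sphere_volume` and `mu_sphere` are not from the paper) and evaluates the integral of $\sin^n\theta$ by the midpoint rule rather than by the closed forms (2.13), (2.14).

```python
import math

def sphere_volume(n, r):
    """Volume (2.9) of the n-dimensional sphere of radius r."""
    return (math.pi * r * r) ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)

def mu_sphere(n, r, l, steps=20000):
    """Measure (2.12): 2 r S_{n-1,r} times the integral of sin^n over [0, alpha]."""
    if l >= 2 * r:
        return 0.0
    alpha = math.acos(l / (2.0 * r))
    h = alpha / steps
    integral = sum(math.sin((i + 0.5) * h) ** n for i in range(steps)) * h
    return 2.0 * r * sphere_volume(n - 1, r) * integral

def mu_circle(r, l):
    """Special case (2.15), n = 2."""
    return 2 * r * r * math.acos(l / (2 * r)) - 0.5 * l * math.sqrt(4 * r * r - l * l)

def mu_ball(r, l):
    """Special case (2.16), n = 3."""
    return 4.0 / 3.0 * math.pi * r ** 3 - math.pi * r * r * l + math.pi * l ** 3 / 12.0

if __name__ == "__main__":
    r, l = 1.3, 0.9
    print(mu_sphere(2, r, l), mu_circle(r, l))
    print(mu_sphere(3, r, l), mu_ball(r, l))
```

For $l = 0$ the two spheres coincide, so `mu_sphere(n, r, 0)` should reduce to the full volume `sphere_volume(n, r)`.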


We shall now generalize the formula (2.8) to $n$-space.

A direction in $n$-space may be given by the corresponding point on the surface
of the $n$-dimensional sphere of unit radius, that is, by the end point of the radius
which is parallel to the given direction. The parametric equations of the
$n$-sphere $\sum_1^n \xi_i^2 = 1$ are

$$\begin{aligned}
\xi_1 &= \cos\varphi_1 \\
\xi_2 &= \sin\varphi_1 \cos\varphi_2 \\
\xi_3 &= \sin\varphi_1 \sin\varphi_2 \cos\varphi_3 \\
&\;\;\vdots \\
\xi_{n-1} &= \sin\varphi_1 \sin\varphi_2 \cdots \sin\varphi_{n-2}\cos\varphi_{n-1} \\
\xi_n &= \sin\varphi_1 \sin\varphi_2 \cdots \sin\varphi_{n-2}\sin\varphi_{n-1},
\end{aligned} \tag{2.17}$$

where $0 \le \varphi_i \le \pi$ for $i < n - 1$ and $0 \le \varphi_{n-1} \le 2\pi$. The element of area of
this $n$-sphere has the value (see [2, p. 109])

$$d\sigma = \sin^{n-2}\varphi_1 \sin^{n-3}\varphi_2 \cdots \sin\varphi_{n-2}\, d\varphi_1\, d\varphi_2 \cdots d\varphi_{n-1}. \tag{2.18}$$

A direction in $n$-dimensional space may then be given by the $n - 1$ parameters
$\varphi_1, \varphi_2, \cdots, \varphi_{n-1}$.

If the $n$-dimensional interval $R$ consisting of all points $(x_1, x_2, x_3, \cdots, x_n)$
such that $0 \le x_i \le A_i$ $(i = 1, 2, 3, \cdots, n)$ is translated a
distance $l$ $(l \le \min(A_1, A_2, A_3, \cdots, A_n))$ in the direction $(\varphi_1, \varphi_2, \cdots, \varphi_{n-1})$,
the intersection of the translated interval with $R$ is a new interval whose volume
has the value $\prod_1^n (A_i - x_i)$, where $x_i = l\xi_i$ ($\xi_i$ given by (2.17)).











Our purpose is to evaluate the integral

$$\Phi(R, l) = \int_{E_n} \prod_1^n (A_i - x_i)\, d\sigma \tag{2.19}$$

extended over the surface $E_n$ of the $n$-dimensional sphere of radius unity. We
shall denote by $E_n$ either the surface of the $n$-dimensional sphere of radius unity
or its area, given, as is known [2, p. 110], by

$$E_n = \frac{2\pi^{n/2}}{\Gamma\left(\frac{n}{2}\right)}. \tag{2.20}$$
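The area formula (2.20) can be compared with a direct integration of the area element (2.18), which factors into one-dimensional integrals of powers of the sine. A minimal sketch, with illustrative function names and a midpoint rule chosen for simplicity:

```python
import math

def unit_sphere_area(n):
    """Area (2.20) of the n-dimensional sphere of radius unity."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def unit_sphere_area_by_angles(n, steps=20000):
    """Integrate the area element (2.18) for n >= 2.
    phi_{n-1} ranges over [0, 2*pi]; each remaining angle contributes the
    integral of sin^k over [0, pi], with k = 1, ..., n-2."""
    area = 2.0 * math.pi
    h = math.pi / steps
    for k in range(1, n - 1):
        area *= sum(math.sin((i + 0.5) * h) ** k for i in range(steps)) * h
    return area

if __name__ == "__main__":
    for n in range(2, 7):
        print(n, unit_sphere_area(n), unit_sphere_area_by_angles(n))
```

For $n = 2$ and $n = 3$ both functions give the familiar values $2\pi$ and $4\pi$.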


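Because of the symmetry, the coefficients of all the products $A_{i_1}A_{i_2} \cdots A_{i_{n-k}}$
have the same value

$$a_k = (-1)^k \int_{E_n} x_1 x_2 \cdots x_k\, d\sigma.$$

The integral extended over the whole surface $E_n$ equals $2^n$ times the integral
extended over the portion for which $\xi_i \ge 0$. Hence, taking into account (2.17)
and (2.18) we get

$$\begin{aligned}
a_k &= (-1)^k 2^k l^k E_{n-k} \int_0^{\pi/2} \cdots \int_0^{\pi/2} \sin^{n+k-3}\varphi_1 \cos\varphi_1 \sin^{n+k-5}\varphi_2 \cos\varphi_2 \cdots \sin^{n-k-1}\varphi_k \cos\varphi_k\, d\varphi_1\, d\varphi_2 \cdots d\varphi_k \\
&= \frac{(-1)^k 2^k l^k E_{n-k}}{(n+k-2)(n+k-4)\cdots(n+k-2k)}
\end{aligned} \tag{2.21}$$

for $k = 1, 2, \cdots, n - 1$. For $k = n$ we find that

$$\begin{aligned}
a_n &= (-1)^n 2^n l^n \int_0^{\pi/2} \cdots \int_0^{\pi/2} \sin^{2n-3}\varphi_1 \cos\varphi_1 \cdots \sin\varphi_{n-1}\cos\varphi_{n-1}\, d\varphi_1\, d\varphi_2 \cdots d\varphi_{n-1} \\
&= \frac{(-1)^n 2^n l^n}{(2n-2)(2n-4)\cdots 4\cdot 2}.
\end{aligned} \tag{2.22}$$

Hence, we have the following general formula

$$\Phi(R, l) = A_1A_2 \cdots A_n E_n + \frac{(-1)^n 2^n l^n}{(2n-2)(2n-4)\cdots 4\cdot 2} + \sum_{k=1}^{n-1} (-1)^k \Big(\sum_{i_1, i_2, \cdots, i_{n-k}} A_{i_1}A_{i_2} \cdots A_{i_{n-k}}\Big) \frac{2^k l^k E_{n-k}}{(n+k-2)(n+k-4)\cdots(n+k-2k)}. \tag{2.23}$$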













In particular, for $n = 2$ this result coincides with (2.8). For $n = 3$ we have

$$\Phi(R, l) = 4\pi A_1A_2A_3 - l^3 - 2\pi l(A_1A_2 + A_1A_3 + A_2A_3) + \tfrac{8}{3}l^2(A_1 + A_2 + A_3). \tag{2.24}$$
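The general formula (2.23) can be evaluated mechanically and compared with the special cases (2.8) and (2.24). The sketch below is one possible transcription, not the author's computation; the function names are illustrative. For the planar check by quadrature it is assumed that the overlap of the translated rectangle is $(A_1 - l|\cos\theta|)(A_2 - l|\sin\theta|)$, i.e. that the components of the translation enter through their absolute values, as the symmetry argument above presupposes.

```python
import math
from itertools import combinations

def unit_sphere_area(n):
    """E_n of (2.20)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def phi_interval(A, l):
    """Formula (2.23) for edge lengths A = (A_1, ..., A_n), 0 <= l <= min(A)."""
    n = len(A)
    total = math.prod(A) * unit_sphere_area(n)
    denom = 1
    for j in range(1, n):              # (2n-2)(2n-4)...4.2
        denom *= 2 * j
    total += (-1) ** n * (2 * l) ** n / denom
    for k in range(1, n):
        # sum of all products of n-k distinct edge lengths
        sym = sum(math.prod(c) for c in combinations(A, n - k))
        denom = 1
        for q in range(n - k, n + k - 1, 2):   # (n-k)(n-k+2)...(n+k-2)
            denom *= q
        total += (-1) ** k * sym * (2 * l) ** k * unit_sphere_area(n - k) / denom
    return total

def phi_rect_numeric(A1, A2, l, steps=40000):
    """Direct quadrature of (2.6) for a rectangle, using absolute values (assumption)."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (A1 - l * abs(math.cos(t))) * (A2 - l * abs(math.sin(t)))
    return total * h

if __name__ == "__main__":
    print(phi_interval((3.0, 4.0), 1.0), 24 * math.pi - 26)        # (2.8)
    print(phi_interval((2.0, 3.0, 4.0), 1.0), 44 * math.pi + 23)   # (2.24)
```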


3. First problem. We can now solve the first problem (i) stated in the
introduction. Denoting by the same letters either sets or their measures, we consider,
as in [1] and [4], the set $Y$ of points of $R$ that do not belong to $X$. We have
identically:

$$X + Y = R. \tag{3.1}$$

The general method of Robbins [3], taking into account (2.2), gives immediately
the first moments

$$E(Y) = R\left(1 - \frac{\rho}{R'}\right)^N, \qquad E(X) = R\left\{1 - \left(1 - \frac{\rho}{R'}\right)^N\right\}, \tag{3.2}$$

where $R = A_1A_2$, $R' = (A_1 + 2\delta)(A_2 + 2\delta)$, $\rho = ab$.

Our remaining problem is that of evaluating the second moment of $X$. Let
$x_i, y_i, \varphi_i$ $(i = 1, 2, 3, \cdots, N)$ be the coordinates of the $N$ rectangles $\rho$ (section 2)
and let us put, as in (2.1), $d\rho_i = dx_i\,dy_i\,d\varphi_i$. Let $P(x, y)$ and $P_0(x_0, y_0)$ be two
points which belong to $R$ and let us put $dP = dx\,dy$, $dP_0 = dx_0\,dy_0$. Let us
consider the following multiple integral

$$J = \int \frac{dP\, dP_0\, d\rho_1\, d\rho_2 \cdots d\rho_N}{(2\pi R')^N} \tag{3.3}$$

extended over the sets of rectangles $\rho_i$ (congruent with $\rho$) such that $x_i, y_i$ belongs
to $R'$, $0 \le \varphi_i \le 2\pi$, and which do not contain either $P$ or $P_0$. That is, the domain of
integration of $J$ is defined by

$$\begin{gathered}
-\delta \le x_i \le A_1 + \delta, \quad -\delta \le y_i \le A_2 + \delta, \quad 0 \le \varphi_i \le 2\pi, \\
P \in R, \quad P_0 \in R, \quad P \notin \rho_i, \quad P_0 \notin \rho_i, \quad (i = 1, 2, \cdots, N).
\end{gathered} \tag{3.4}$$

In order to calculate $J$, we can first keep the rectangles $\rho_i$ fixed; the points $P$
and $P_0$ can then vary independently over the set of points $Y$. That gives

$$J = \int \frac{Y^2\, d\rho_1\, d\rho_2 \cdots d\rho_N}{(2\pi R')^N} = E(Y^2). \tag{3.5}$$

We can now reverse the order of integration, an operation which is obviously
justified in this case. Keeping $P$ and $P_0$ fixed, we can vary each rectangle $\rho_i$
over the set of positions in which it does not contain either $P$ or $P_0$; letting $l$
denote the distance $PP_0$, we have, according to (2.5),

$$J = \int_{P \in R,\, P_0 \in R} \left(1 - \frac{4\pi\rho - \mu(\rho, l)}{2\pi R'}\right)^N dP\, dP_0. \tag{3.6}$$







In order to evaluate this integral we divide it into two parts $J = J_1 + J_2$,
according as $0 \le l \le d$ or $d \le l \le D$, where $d = (a^2 + b^2)^{1/2}$ and $D = (A_1^2 + A_2^2)^{1/2}$.
In the interval $0 \le l \le d$ we introduce the new variables of integration $l, \theta$
related to $x, y, x_0, y_0$ by

$$x_0 = x + l\cos\theta, \qquad y_0 = y + l\sin\theta, \tag{3.7}$$

whence

$$\frac{\partial(x, y, x_0, y_0)}{\partial(x, y, l, \theta)} = l.$$

In terms of the new variables we have

$$J_1 = \int \left(1 - \frac{4\pi\rho - \mu(\rho, l)}{2\pi R'}\right)^N l\, dl\, dP\, d\theta.$$

In this integral the point $P$ can vary over the intersection of $R$ with the figure
obtained by translating $R$ a distance $l$ in the direction $\theta$; that is, the integration
of $dP$ gives the function $F(R, l, \theta)$ defined in section 2. According to (2.6) we
therefore have

$$J_1 = \int_0^d \left(1 - \frac{4\pi\rho - \mu(\rho, l)}{2\pi R'}\right)^N \Phi(R, l)\, l\, dl, \tag{3.8}$$

where $\mu(\rho, l)$ is given by (2.4) and $\Phi(R, l)$ by (2.8).

In order to evaluate $J_2$ we observe that in the interval $d \le l \le D$, $\mu(\rho, l) = 0$
and we have

$$J_2 = \left(1 - \frac{2\rho}{R'}\right)^N \int_{d \le l \le D} dP\, dP_0 = \left(1 - \frac{2\rho}{R'}\right)^N \left\{\int_{0 \le l \le D} dP\, dP_0 - \int_{0 \le l \le d} dP\, dP_0\right\}.$$

Further we have

$$\int_{0 \le l \le D} dP\, dP_0 = R^2 \tag{3.9}$$

and with the change of variables (3.7) and the formula (2.8) we find that

$$\int_{0 \le l \le d} dP\, dP_0 = \int_0^d \Phi(R, l)\, l\, dl = \pi A_1A_2 d^2 - \tfrac{4}{3}(A_1 + A_2)d^3 + \tfrac{1}{2}d^4. \tag{3.10}$$

Collecting (3.8), (3.9), (3.10) and taking into account (3.5) we have

$$E(Y^2) = \int_0^d \left(1 - \frac{4\pi\rho - \mu(\rho, l)}{2\pi R'}\right)^N \Phi(R, l)\, l\, dl + \left(1 - \frac{2\rho}{R'}\right)^N \left\{R^2 - \pi A_1A_2 d^2 + \tfrac{4}{3}(A_1 + A_2)d^3 - \tfrac{1}{2}d^4\right\}, \tag{3.11}$$







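where $\rho = ab$, $R = A_1A_2$, $R' = (A_1 + 2\delta)(A_2 + 2\delta)$, $\mu(\rho, l)$ is given by (2.4) and
$\Phi(R, l)$ by (2.8).

For the variance of $X$ and of $Y$, we have by (3.1) and (3.2)

$$\begin{aligned}
\sigma_X^2 &= E(X^2) - E^2(X) = E(Y^2) - E^2(Y) \\
&= \int_0^d \left(1 - \frac{4\pi\rho - \mu(\rho, l)}{2\pi R'}\right)^N \Phi(R, l)\, l\, dl + \left(1 - \frac{2\rho}{R'}\right)^N \left\{R^2 - \pi A_1A_2 d^2 + \tfrac{4}{3}(A_1 + A_2)d^3 - \tfrac{1}{2}d^4\right\} - R^2\left(1 - \frac{\rho}{R'}\right)^{2N},
\end{aligned}$$

which completes the solution of our first problem stated in the introduction.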
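The result of this section reduces the variance to a single one-dimensional integral, which is easy to evaluate numerically. The following sketch does so by the midpoint rule; it is an illustration under the paper's assumptions ($a \le b$ and $(a^2+b^2)^{1/2} \le \min(A_1, A_2, \delta)$), and all function and parameter names are hypothetical.

```python
import math

def mu_rect(a, b, l):
    """mu(rho, l) of (2.4); assumes a <= b."""
    if l >= math.hypot(a, b):
        return 0.0
    if l <= a:
        return 2 * math.pi * a * b - 4 * l * (a + b) + 2 * l * l
    if l <= b:
        return 4 * (a * b * math.asin(a / l) - 0.5 * a * a - b * l
                    + b * math.sqrt(l * l - a * a))
    return 4 * (a * b * (math.asin(a / l) - math.acos(b / l))
                + b * math.sqrt(l * l - a * a) + a * math.sqrt(l * l - b * b)
                - 0.5 * (a * a + b * b) - 0.5 * l * l)

def second_moment_uncovered(A1, A2, a, b, delta, N, steps=4000):
    """E(Y^2) from (3.11), with the integral done by the midpoint rule."""
    R, Rp, rho = A1 * A2, (A1 + 2 * delta) * (A2 + 2 * delta), a * b
    d = math.hypot(a, b)
    h = d / steps
    integral = 0.0
    for i in range(steps):
        l = (i + 0.5) * h
        p = 1.0 - (4 * math.pi * rho - mu_rect(a, b, l)) / (2 * math.pi * Rp)
        phi = 2 * math.pi * A1 * A2 - 4 * l * (A1 + A2) + 2 * l * l   # (2.8)
        integral += p ** N * phi * l
    integral *= h
    tail = R * R - math.pi * A1 * A2 * d ** 2 + 4.0 / 3.0 * (A1 + A2) * d ** 3 - 0.5 * d ** 4
    return integral + (1.0 - 2.0 * rho / Rp) ** N * tail

def variance_covered(A1, A2, a, b, delta, N):
    """Variance of X (equal to that of Y) from (3.2) and (3.11)."""
    R, Rp, rho = A1 * A2, (A1 + 2 * delta) * (A2 + 2 * delta), a * b
    mean_Y = R * (1.0 - rho / Rp) ** N
    return second_moment_uncovered(A1, A2, a, b, delta, N) - mean_Y ** 2

if __name__ == "__main__":
    for N in (1, 5, 50):
        print(N, variance_covered(10.0, 8.0, 1.0, 2.0, 3.0, N))
```

With $N = 0$ no rectangle is dropped, $Y = R$ with certainty, and the formula should return $E(Y^2) = R^2$, which gives a convenient sanity check.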

4. Second problem. In order to solve the second problem (ii) stated in the
introduction we will follow the same method as in the preceding section.

Let $X$ be the intersection of the set-theoretical sum of the $N$ $n$-dimensional
spheres $S_{n,r}$ of radius $r$ with the $n$-interval $R$. Let us call $Y$ the set of those points
of $R$ that do not belong to $X$, that is,

$$X + Y = R. \tag{4.1}$$

The general method of Robbins gives immediately

$$E(Y) = R\left(1 - \frac{S_{n,r}}{R'}\right)^N, \qquad E(X) = R\left\{1 - \left(1 - \frac{S_{n,r}}{R'}\right)^N\right\}, \tag{4.2}$$

where $R = \prod_1^n A_i$, $R' = \prod_1^n (A_i + 2\delta)$, and $S_{n,r}$ is given by (2.9).
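A short sketch of (4.2), with the sphere volume taken from (2.9); the function names and the sample dimensions are illustrative assumptions, not values from the paper.

```python
import math

def sphere_volume(n, r):
    """S_{n,r} of (2.9)."""
    return (math.pi * r * r) ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)

def expected_covered(A, r, delta, N):
    """E(X) of (4.2) for an n-interval with edges A and N spheres of radius r."""
    n = len(A)
    R = math.prod(A)
    Rp = math.prod(Ai + 2.0 * delta for Ai in A)
    return R * (1.0 - (1.0 - sphere_volume(n, r) / Rp) ** N)

if __name__ == "__main__":
    # Plane case: 10 x 10 square, unit circles, border delta = 1.
    for N in (1, 10, 100):
        print(N, expected_covered((10.0, 10.0), 1.0, 1.0, N))
```

For a single sphere the expected covered measure is simply $R\,S_{n,r}/R'$, and the value increases with $N$ towards $R$.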


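We now wish to evaluate $E(Y^2)$. For this purpose let $Q_1(y_1^1, y_2^1, \cdots, y_n^1)$
and $Q_2(y_1^2, y_2^2, \cdots, y_n^2)$ be two points which belong to $R$ and $P_i(x_1^i, x_2^i, \cdots, x_n^i)$
be the centers of the $N$ spheres $S_{n,r}$. Let us put

$$dQ_i = dy_1^i\, dy_2^i \cdots dy_n^i \ (i = 1, 2), \qquad dP_i = dx_1^i\, dx_2^i \cdots dx_n^i \ (i = 1, 2, \cdots, N). \tag{4.3}$$

Consider the integral

$$J = \int \frac{dQ_1\, dQ_2\, dP_1\, dP_2 \cdots dP_N}{R'^N} \tag{4.4}$$

extended over the domain defined by

$$Q_1 \in R, \quad Q_2 \in R, \quad P_i \in R', \quad Q_1P_i \ge r, \quad Q_2P_i \ge r, \quad (i = 1, 2, \cdots, N).$$

If we keep $P_1, P_2, P_3, \cdots, P_N$ fixed, each point $Q_1, Q_2$ can vary independently
over the set $Y$; consequently we have

$$J = \int_{P_i \in R'} \frac{Y^2\, dP_1\, dP_2 \cdots dP_N}{R'^N} = E(Y^2). \tag{4.5}$$

On the other hand, if we keep $Q_1$ and $Q_2$ fixed, the integral of each $dP_i$ gives
$R' - 2S_{n,r} + \mu(S_{n,r}, l)$, where $\mu(S_{n,r}, l)$ is given by (2.12) and $l = Q_1Q_2$.
Hence we have

$$J = \int_{Q_1 \in R,\, Q_2 \in R} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N dQ_1\, dQ_2. \tag{4.6}$$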


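In order to calculate this integral we split it into two parts $J = J_1 + J_2$,
according as $0 \le l \le 2r$ or $2r \le l \le D$, where $D = (\sum_1^n A_i^2)^{1/2}$. In the interval
$0 \le l \le 2r$ we introduce the new variables of integration $l, \varphi_1, \varphi_2, \cdots, \varphi_{n-1}$
related to $y_1^1, y_2^1, \cdots, y_n^1, y_1^2, y_2^2, \cdots, y_n^2$ by

$$y_i^2 = y_i^1 + l\xi_i, \quad (i = 1, 2, \cdots, n), \tag{4.7}$$

where $\xi_i$ is given in (2.17). It is found that

$$\frac{\partial(y_1^1, y_2^1, \cdots, y_n^1, y_1^2, y_2^2, \cdots, y_n^2)}{\partial(y_1^1, y_2^1, \cdots, y_n^1, l, \varphi_1, \cdots, \varphi_{n-1})} = l^{n-1} \sin^{n-2}\varphi_1 \sin^{n-3}\varphi_2 \cdots \sin\varphi_{n-2}.$$

Hence we have

$$dQ_1\, dQ_2 = l^{n-1}\, dQ_1\, d\sigma\, dl, \tag{4.8}$$

where $d\sigma$ denotes the element of area of the $n$-dimensional sphere of unit radius,
given by (2.18). The same method used in section 3 gives

$$J_1 = \int_0^{2r} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N \Phi(R, l)\, l^{n-1}\, dl, \tag{4.9}$$

where $\Phi(R, l)$ is given by (2.23).

In the interval $2r \le l \le D$, $\mu(S_{n,r}, l) = 0$ and we have

$$J_2 = \left(1 - \frac{2S_{n,r}}{R'}\right)^N \int_{2r \le l \le D} dQ_1\, dQ_2 = \left(1 - \frac{2S_{n,r}}{R'}\right)^N \left\{\int_{0 \le l \le D} dQ_1\, dQ_2 - \int_{0 \le l \le 2r} dQ_1\, dQ_2\right\}. \tag{4.10}$$

Now we have

$$\int_{0 \le l \le D} dQ_1\, dQ_2 = R^2 \tag{4.11}$$

and with the change of variables (4.7) we readily find that

$$\int_{0 \le l \le 2r} dQ_1\, dQ_2 = \int_0^{2r} \Phi(R, l)\, l^{n-1}\, dl. \tag{4.12}$$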


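Collecting (4.9), (4.10), (4.11), (4.12) and taking into account (4.5) and
(2.23) we have

$$\begin{aligned}
E(Y^2) = {} & \int_0^{2r} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{R'}\right)^N \Phi(R, l)\, l^{n-1}\, dl \\
& + \left(1 - \frac{2S_{n,r}}{R'}\right)^N \Bigg\{R^2 - \frac{2^n r^n E_n R}{n} - \frac{(-1)^n 2^{3n} r^{2n}}{2n(2n-2)\cdots 4\cdot 2} \\
& \qquad - \sum_{k=1}^{n-1} (-1)^k \Big(\sum_{i_1, i_2, \cdots, i_{n-k}} A_{i_1}A_{i_2} \cdots A_{i_{n-k}}\Big) \frac{2^{n+2k}\, r^{n+k} E_{n-k}}{(n+k)(n+k-2)\cdots(n-k)}\Bigg\},
\end{aligned} \tag{4.13}$$

where $R = \prod_1^n A_i$, $R' = \prod_1^n (A_i + 2\delta)$; $S_{n,r}$ is given by (2.9), $E_n$ by (2.20), $\mu(S_{n,r}, l)$
by (2.12) and $\Phi(R, l)$ by (2.23). In particular, for $n = 2$, we obtain the value
given by Robbins [3, (30)], by use of (2.8), (2.15) and the equations $S_{2,r} = \pi r^2$,
$E_1 = 2$. For $n = 3$, the case of ordinary space, it follows from (2.16), (2.24)
and the equations $S_{3,r} = \tfrac{4}{3}\pi r^3$, $E_3 = 4\pi$, $E_2 = 2\pi$, that

$$\begin{aligned}
E(Y^2) = {} & \int_0^{2r} \left(1 - \frac{16\pi r^3 + 12\pi r^2 l - \pi l^3}{12R'}\right)^N \Big(4\pi R - l^3 - 2\pi(A_1A_2 + A_1A_3 + A_2A_3)\,l + \tfrac{8}{3}(A_1 + A_2 + A_3)\,l^2\Big)\, l^2\, dl \\
& + \left(1 - \frac{8\pi r^3}{3R'}\right)^N \Big\{R^2 - \tfrac{32}{3}\pi R r^3 + 8\pi(A_1A_2 + A_1A_3 + A_2A_3)\, r^4 - \tfrac{256}{15}(A_1 + A_2 + A_3)\, r^5 + \tfrac{32}{3} r^6\Big\}.
\end{aligned} \tag{4.14}$$

In this case the exact evaluation is easy if one expands the binomial under the
sign of the integral and integrates term by term.

From (4.1) we see that $\sigma_X^2 = E(X^2) - E^2(X) = E(Y^2) - E^2(Y)$. Thus,
from (4.2) and (4.13) we obtain immediately the second moment $E(X^2)$ and the
variance $\sigma_X^2$ of $X$.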


5. Remark. In the second problem we can replace the $n$-intervals $R$ and
$R'$ by concentric $n$-dimensional spheres. The problem may then be stated as
follows:

Let $S_{n,a}$ denote a fixed $n$-dimensional sphere of radius $a$ and $S_{n,a+\delta}$ the
concentric $n$-dimensional sphere of radius $a + \delta$. $S_{n,a}$ and $S_{n,a+\delta}$ shall also denote
the corresponding volumes. Let a fixed number $N$ of $n$-dimensional spheres
with radii $r$ ($r \le \min(a, \delta)$) be chosen independently with the probability density
function for the center of each $S_{n,r}$ constant and equal to $1/S_{n,a+\delta}$ in $S_{n,a+\delta}$
and zero outside this $n$-sphere. Let $X$ denote the intersection of the set-theoretical
sum of the $N$ $n$-spheres with $S_{n,a}$; we wish to evaluate the first two
moments of the measure of $X$.








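It suffices to observe that in this case we have

$$\Phi(S_{n,a}, l) = \mu(S_{n,a}, l)\, E_n = 2a\, S_{n-1,a}\, E_n \int_0^{\alpha} \sin^n\theta\, d\theta, \tag{5.1}$$

where $S_{n-1,a}$ is the volume of the $(n-1)$-dimensional sphere of radius $a$ and
$\alpha = \arccos(l/2a)$.

The same method used in section 4 gives

$$E(Y) = S_{n,a}\left(1 - \frac{S_{n,r}}{S_{n,a+\delta}}\right)^N, \qquad E(X) = S_{n,a}\left\{1 - \left(1 - \frac{S_{n,r}}{S_{n,a+\delta}}\right)^N\right\}, \tag{5.2}$$

$$E(Y^2) = \int_0^{2r} \left(1 - \frac{2S_{n,r} - \mu(S_{n,r}, l)}{S_{n,a+\delta}}\right)^N \Phi(S_{n,a}, l)\, l^{n-1}\, dl + \left(1 - \frac{2S_{n,r}}{S_{n,a+\delta}}\right)^N \left\{S_{n,a}^2 - \int_0^{2r} \Phi(S_{n,a}, l)\, l^{n-1}\, dl\right\}, \tag{5.3}$$

where $\Phi(S_{n,a}, l)$ is given by (5.1).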
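In particular, for $n = 2$, by use of (5.1), (2.15) and the indefinite integrals

$$\int \arccos(l/2a)\, l\, dl = (\tfrac{1}{2}l^2 - a^2)\arccos(l/2a) - \tfrac{1}{4}l(4a^2 - l^2)^{1/2} + \text{constant},$$

$$\int l^2(4a^2 - l^2)^{1/2}\, dl = -\tfrac{1}{4}l(4a^2 - l^2)^{3/2} + \tfrac{1}{2}a^2 l(4a^2 - l^2)^{1/2} + 2a^4 \arcsin(l/2a) + \text{constant}$$

we find that

$$\begin{aligned}
E(Y^2) = {} & 2\pi \int_0^{2r} \left(1 - \frac{2\pi r^2 - 2r^2\arccos(l/2r) + \tfrac{1}{2}l(4r^2 - l^2)^{1/2}}{\pi(a + \delta)^2}\right)^N \left(2a^2\arccos(l/2a) - \tfrac{1}{2}l(4a^2 - l^2)^{1/2}\right) l\, dl \\
& + \left(1 - \frac{2r^2}{(a + \delta)^2}\right)^N \Big\{\pi^2 a^4 - 2\pi\Big(2a^2(2r^2 - a^2)\arccos(r/a) - 3a^2 r(a^2 - r^2)^{1/2} \\
& \qquad + \pi a^4 + 2r(a^2 - r^2)^{3/2} - a^4\arcsin(r/a)\Big)\Big\}.
\end{aligned}$$

For $n = 3$, we have by (5.1) and (2.16)

$$\begin{aligned}
E(Y^2) = {} & 4\pi \int_0^{2r} \left(1 - \frac{16r^3 + 12r^2 l - l^3}{16(a + \delta)^3}\right)^N \left(\tfrac{4}{3}\pi a^3 - \pi a^2 l + \tfrac{1}{12}\pi l^3\right) l^2\, dl \\
& + 4\pi^2\left(1 - \frac{2r^3}{(a + \delta)^3}\right)^N \left\{\tfrac{4}{9}a^6 - \tfrac{32}{9}a^3 r^3 + 4a^2 r^4 - \tfrac{8}{9}r^6\right\}.
\end{aligned}$$

From (5.2) and (5.3) with the use of the relation $\sigma_X^2 = E(X^2) - E^2(X) =
E(Y^2) - E^2(Y)$ we obtain immediately the second moment $E(X^2)$ and the
variance $\sigma_X^2$ of $X$.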
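The two indefinite integrals used for $n = 2$ in this section can be verified by differentiating the stated antiderivatives numerically and comparing with the integrands. A minimal sketch; the names `F1`, `F2` and `derivative` are illustrative, and a central difference stands in for the exact derivative.

```python
import math

def F1(l, a):
    """Stated antiderivative of l * arccos(l / 2a)."""
    return ((0.5 * l * l - a * a) * math.acos(l / (2 * a))
            - 0.25 * l * math.sqrt(4 * a * a - l * l))

def F2(l, a):
    """Stated antiderivative of l^2 * (4a^2 - l^2)^(1/2)."""
    s = 4 * a * a - l * l
    return (-0.25 * l * s ** 1.5 + 0.5 * a * a * l * math.sqrt(s)
            + 2 * a ** 4 * math.asin(l / (2 * a)))

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    a, l = 1.7, 0.9
    print(derivative(lambda t: F1(t, a), l), l * math.acos(l / (2 * a)))
    print(derivative(lambda t: F2(t, a), l), l * l * math.sqrt(4 * a * a - l * l))
```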










REFERENCES

[1] J. Bronowski and J. Neyman, "The variance of the measure of a two-dimensional
random set," Annals of Math. Stat., Vol. 16 (1945), pp. 330-341.

[2] R. Deltheil, Probabilités géométriques, Gauthier-Villars, Paris, 1926.

[3] H. E. Robbins, "On the measure of a random set," Annals of Math. Stat., Vol. 15
(1944), pp. 70-74.

[4] H. E. Robbins, "On the measure of a random set, II," Annals of Math. Stat., Vol. 16
(1945), pp. 342-347.

[5] L. A. Santaló, "Sobre la medida cinemática en el plano," Abhandlungen aus dem
Mathematischen Seminar der Hamburgischen Universität, Vol. 11 (1936), pp.
222-236.











ON A TEST OF WHETHER ONE OF TWO RANDOM VARIABLES 
IS STOCHASTICALLY LARGER THAN THE OTHER 


By H. B. Mann and D. R. Whitney
Ohio State University 


1. Summary. Let $x$ and $y$ be two random variables with continuous cumulative
distribution functions $f$ and $g$. A statistic $U$ depending on the relative ranks
of the $x$'s and $y$'s is proposed for testing the hypothesis $f = g$. Wilcoxon proposed
an equivalent test in the Biometrics Bulletin, December, 1945, but gave only a
few points of the distribution of his statistic.

Under the hypothesis $f = g$ the probability of obtaining a given $U$ in a sample
of $n$ $x$'s and $m$ $y$'s is the solution of a certain recurrence relation involving $n$
and $m$. Using this recurrence relation tables have been computed giving the
probability of $U$ for samples up to $n = m = 8$. At this point the distribution is
almost normal.

From the recurrence relation explicit expressions for the mean, variance, and
fourth moment are obtained. The $2r$th moment is shown to have a certain
form which enabled us to prove that the limit distribution is normal if $m, n$ go to
infinity in any arbitrary manner.

The test is shown to be consistent with respect to the class of alternatives
$f(x) > g(x)$ for every $x$.


2. Introduction. Let $x$ and $y$ be two random variables having continuous
cumulative distribution functions $f$ and $g$ respectively. The variable $x$ will be
called stochastically smaller than $y$ if $f(a) > g(a)$ for every $a$. We wish to test
the hypothesis $f = g$ against the alternative that $x$ is stochastically smaller than
$y$. Such alternatives are of great importance in testing, for instance, the effect
of treatments on some measurement. One may think of $x$ as the values of
certain measurements in the control group and of $y$ as the values of the same
measurement in a group which received treatment. In a particular instance
the protective effect against infection by certain bacteria was investigated.
Two groups of rats were used in the experiment, the first group receiving no
treatment, the second group receiving the drug. Both groups were then infected
with supposedly equally diluted cultures of the bacteria under investigation.
Most of the rats in both groups died, but the time of survival was measured and
it was desired to test whether the drug had the effect of prolonging the life of the
rats. It was desired to make inferences from the effect on rats to the effect the
drug would have on humans. Thus, the only relevant alternative to the
hypothesis that survival times are not influenced by the drug is that the survival
time of those rats which received treatment is stochastically larger than that
of the control group.




3. The U test. Let the quantities $x_1, \cdots, x_n, y_1, \cdots, y_m$ be arranged in
order. This arrangement is unique with probability 1 if $P(x_i = y_j) = 0$, and
this follows from our assumption of continuity. Let $U$ count the number of
times a $y$ precedes an $x$. If $P(U \le \bar{U}) = \alpha$ under the null hypothesis, the
test will be considered significant on the significance level $\alpha$ if $U \le \bar{U}$, and the
hypothesis of identical distributions of $x$ and $y$ will be rejected.

This test was first proposed by Wilcoxon [1]. His statistic $T$ is the sum of the
ranks of the $y$'s in the ordered sequence of $x$'s and $y$'s. In general

$$U = mn + \frac{m(m+1)}{2} - T$$

and this gives a simple way of computing $U$. Wilcoxon, however, treated only
the case $m = n$ and in this case he tabulated only 3 points of the distribution of
$T$. Since the test seems of great utility it seemed worthwhile to compute the
variance, the moments and the limit distribution of $U$ and to investigate the
class of alternatives with respect to which the test is consistent.

Although this paper is written in terms of $U$ and the probabilities of $U$ are
tabulated, the results can be easily interpreted in terms of $T$ if so desired.
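The relation between $U$ and $T$ can be illustrated on a small sample. In the sketch below the data are made up for illustration, the function names are hypothetical, and ties are assumed absent (as the continuity assumption guarantees with probability 1).

```python
def u_by_counting(xs, ys):
    """U = number of (y, x) pairs in which the y precedes (is smaller than) the x."""
    return sum(1 for y in ys for x in xs if y < x)

def u_from_rank_sum(xs, ys):
    """U = mn + m(m+1)/2 - T, where T is the sum of the ranks of the y's."""
    n, m = len(xs), len(ys)
    pooled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    T = sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == "y")
    return m * n + m * (m + 1) // 2 - T

if __name__ == "__main__":
    xs = [1.2, 3.4, 5.6]          # illustrative control values
    ys = [2.3, 4.5, 0.7, 6.1]     # illustrative treatment values
    print(u_by_counting(xs, ys), u_from_rank_sum(xs, ys))
```

Both functions return the same value, since each $y$'s rank equals its rank among the $y$'s plus the number of $x$'s it follows.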


4. The distribution of U. Consider now ordered sequences of $n$ $x$'s and
$m$ $y$'s. Since it is only the relation between $x$ and $y$ that matters we replace
each $x$ by a 0 and each $y$ by a 1. Let $U$ count the number of times a 1 precedes a
0. Let $\bar{p}_{nm}(U)$ be the number of sequences of $n$ 0's and $m$ 1's in each of which a 1
precedes a 0 $U$ times. By examining a sequence with the last term omitted we
arrive at the recurrence relation:

$$\bar{p}_{nm}(U) = \bar{p}_{n-1,m}(U - m) + \bar{p}_{n,m-1}(U),$$

where $\bar{p}_{ij}(U) = 0$ if $U < 0$ and $\bar{p}_{i0}(U)$, $\bar{p}_{0i}(U)$ are zero or one according
as $U \ne 0$ or $U = 0$.

Under the null hypothesis each of the $(m + n)!/m!\,n!$ sequences of $n$ 0's and
$m$ 1's is equally likely. Consequently if $p_{nm}(U)$ represents the probability of a
sequence in which a 1 precedes a 0 $U$ times then

$$p_{nm}(U) = \frac{n}{n+m}\, p_{n-1,m}(U - m) + \frac{m}{n+m}\, p_{n,m-1}(U). \tag{1}$$

Using the recurrence relation (1) the probabilities $p_{nm}(U)$ have been tabulated
for $m \le n \le 8$ (see Table I). For $m = n = 8$ the distribution of $U - \tfrac{1}{2}(nm + 1)$
differs only a negligible amount from the normal distribution. We shall, in the
following, derive the mean, the variance, and the fourth moment of $U$, and
prove that the limit distribution of $U$ is normal if $n$ and $m$ both approach infinity
in any arbitrary manner.

It is obvious that $p_{nm}(U) = p_{mn}(U)$.

Since the probability of the $i$th 1 preceding the $j$th 0 is $\tfrac{1}{2}$, we have

$$E_{nm}(U) = nm/2. \tag{2}$$
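The recurrence for the number of sequences is straightforward to program. The following sketch (function names are illustrative) computes the counts by the recurrence, checks them against a brute-force enumeration of all sequences, and forms the cumulative probabilities of the kind tabulated in Table I.

```python
from functools import lru_cache
from itertools import combinations
from math import comb

@lru_cache(maxsize=None)
def count(n, m, U):
    """Number of sequences of n 0's and m 1's in which a 1 precedes a 0 exactly U times."""
    if U < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if U == 0 else 0
    # last term is a 0 (preceded by all m 1's) or a 1 (precedes no 0)
    return count(n - 1, m, U - m) + count(n, m - 1, U)

def cumulative(n, m, U):
    """P(U' <= U) under the null hypothesis."""
    return sum(count(n, m, u) for u in range(U + 1)) / comb(n + m, n)

def count_brute(n, m, U):
    """Same count, by listing every placement of the n 0's among n + m positions."""
    total = 0
    for zeros in combinations(range(n + m), n):
        # the j-th 0 (j = 0, 1, ...) at position p is preceded by p - j 1's
        u = sum(p - j for j, p in enumerate(zeros))
        total += (u == U)
    return total

if __name__ == "__main__":
    print([round(cumulative(3, 3, u), 3) for u in range(5)])
    mean = sum(u * count(4, 3, u) for u in range(13)) / comb(7, 3)
    print(mean)   # compare with nm/2 = 6
```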















































































TABLE I
Probability of Obtaining a U not Larger than that Tabulated in Comparing Samples of
n and m

n = 3
  U   m = 1   m = 2   m = 3
  0   .250    .100    .050
  1   .500    .200    .100
  2   .750    .400    .200
  3           .600    .350
  4                   .500

n = 4
  U   m = 1   m = 2   m = 3   m = 4
  0   .200    .067    .028    .014
  1   .400    .133    .057    .029
  2   .600    .267    .114    .057
  3           .400    .200    .100
  4           .600    .314    .171
  5                     …       …
  6                   .571    .343
  7                           .443
  8                           .557

n = 5
  U   m = 1   m = 2   m = 3   m = 4   m = 5
  0   .167    .047    .018    .008    .004
  1   .333    .095    .036    .016    .008
  2   .500    .190    .071    .032    .016
  3   .667    .286    .125    .056    .028
  4           .429    .196    .095    .048
  5           .571    .286    .143    .075
  6                   .393    .206    .111
  7                   .500    .278    .155
  8                   .607    .365    .210
  9                           .452    .274
 10                           .548    .345
 11                                   .421
 12                                   .500
 13                                   .579

n = 6
  U   m = 1   m = 2   m = 3   m = 4   m = 5   m = 6
  0   .143    .036    .012    .005    .002    .001
  1   .286    .071    .024    .010    .004    .002
  2   .428    .143    .048    .019    .009    .004
  3   .571    .214    .083    .033    .015    .008
  4           .321    .131    .057    .026    .013
  5           .429    .190    .086    .041    .021
  6           .571    .274    .129    .063    .032
  7                   .357    .176    .089    .047
  8                   .452    .238    .123    .066
  9                   .548    .305    .165    .090
 10                           .381    .214    .120
 11                           .457    .268    .155
 12                           .545    .331    .197
 13                                   .396    .242
 14                                   .465    .294
 15                                   .535    .350
 16                                           .409
 17                                           .469
 18                                           .531


























n = 7
  U   m = 1   m = 2   m = 3   m = 4   m = 5   m = 6   m = 7
  0     …       …       …     .003    .001    .001      …
  1     …     .056    .017    .006    .003    .001      …
  2     …     .111    .033    .012    .005    .002      …
  3     …     .167    .058    .021    .009    .004      …
  4     …     .250    .092    .036    .015    .007      …
  5           .333    .133    .055    .024    .011    .006
  6           .444    .192    .082    .037    .017    .009
  7             …     .258    .115    .053    .026    .013
  8                   .333    .158    .074    .037    .019
  9                   .417    .206    .101    .051    .027
 10                   .500    .264    .134    .069    .036
 11                   .583    .324    .172    .090    .049
 12                           .394    .216      …     .064
 13                           .464    .265      …     .082
 14                           .538    .319      …     .104
 15                                   .378      …     .130
 16                                   .438      …     .159
 17                                   .500      …     .191
 18                                   .562      …     .228
 19                                             …     .267
 20                                             …     .310
 21                                             …     .355
 22                                                   .402
 23                                                   .451
 24                                                   .500
 25                                                   .549









H. B. MANN AND D. R. WHITNEY

TABLE I (Continued)

n = 8

  U |  m=1   m=2   m=3   m=4   m=5   m=6   m=7   m=8 |    t    Normal
 ---+------------------------------------------------+----------------
  0 |   …     …     …     …     …   .000    …     …  |    …       …
  1 | .222  .044  .012  .004  .002    …   .000  .000 |    …       …
  2 | .333  .089  .024  .008  .003  .001  .001  .000 |    …       …
  3 | .444  .133  .042  .014  .005  .002  .001  .001 |    …       …
  4 | .556  .200  .067  .024  .009  .004  .002  .001 |    …       …
  5 |       .267  .097  .036  .015  .006  .003  .001 |    …       …
  6 |       .356  .139  .055  .023  .010  .005  .002 |    …       …
  7 |       .444  .188  .077  .033  .015  .007  .003 |    …       …
  8 |       .556  .248  .107  .047  .021  .010  .005 |    …       …
  9 |             .315  .141  .064  .030  .014  .007 |  2.363   .009
 10 |             .387  .184  .085  .041  .020  .010 |  2.258   .012
 11 |             .461  .230    …   .054  .027  .014 |  2.153   .016
 12 |             .539  .285  .142  .071  .036  .019 |  2.048   .020
 13 |                   .341    …   .091  .047  .025 |  1.943   .026
 14 |                   .404  .217  .114  .060  .032 |  1.838   .033
 15 |                   .467  .262  .141  .076  .041 |  1.733   .041
 16 |                   .533  .311  .172  .095  .052 |  1.628   .052
 17 |                         .362  .207  .116  .065 |  1.523   .064
 18 |                         .416  .245  .140  .080 |  1.418   .078
 19 |                         .472  .286  .168  .097 |  1.313   .094
 20 |                         .528  .331  .198  .117 |  1.208   .113
 21 |                               .377  .232  .139 |  1.102   .135
 22 |                               .426  .268  .164 |   .998   .159
 23 |                               .475  .306  .191 |   .893   .185
 24 |                               .525  .347  .221 |   .788   .215
 25 |                                     .389  .253 |   .683   .247
 26 |                                     .433  .287 |   .578   .282
 27 |                                     .478  .323 |   .473   .318
 28 |                                     .522  .360 |   .368   .356
 29 |                                           .399 |    …       …
 30 |                                           .439 |    …       …
 31 |                                             …  |    …       …
 32 |                                             …  |    …       …
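The entries of Table I can be checked by direct counting. The following is a minimal sketch, assuming that U counts the number of times a y precedes an x and that under the null hypothesis every ordering of the n x's and m y's is equally likely; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def arrangements(n: int, m: int, u: int) -> int:
    """Number of orderings of n x's and m y's in which a y precedes an x exactly u times."""
    if u < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if u == 0 else 0
    # The last element is either an x (all m y's precede it) or a y (adds nothing to U).
    return arrangements(n - 1, m, u - m) + arrangements(n, m - 1, u)


def prob_u_at_most(n: int, m: int, u: int) -> Fraction:
    """P(U <= u) under the null hypothesis, as an exact fraction."""
    favourable = sum(arrangements(n, m, k) for k in range(u + 1))
    return Fraction(favourable, comb(n + m, n))


if __name__ == "__main__":
    for n, m, u in [(3, 3, 3), (4, 4, 8), (6, 6, 1)]:
        print(n, m, u, round(float(prob_u_at_most(n, m, u)), 3))
```

Rounding the exact fractions to three decimals should reproduce the tabulated probabilities for small n and m.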








We now seek an expression for $E_{nm}(u^2)$ where $u = U - nm/2$. After multiplying (1) by $(U - nm/2)^2$, using

$E_{nm}(u^2) = \sum_{U} (U - nm/2)^2\, p_{nm}(U)$

and expanding, we obtain

(3)  $E_{nm}(u^2) = \frac{n}{n+m}\,E_{n-1,m}(u^2) + \frac{m}{n+m}\,E_{n,m-1}(u^2) + \frac{nm}{4},$

where $E_{nm}(u^2)$ denotes the expectation of $(U - nm/2)^2$ in sequences with n 0's and m 1's. The initial conditions of (3) are seen by direct calculation to be

(4)  $E_{n0}(u^2) = E_{0m}(u^2) = 0.$

By substitution $E_{nm}(u^2) = nm(n + m + 1)/12$ is a solution of the recurrence relation (3) and its initial conditions (4). Hence, it follows by mathematical induction that

(5)  $E_{nm}(u^2) = nm(n + m + 1)/12.$
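The induction can be checked numerically. The sketch below assumes that recurrence (3) reads $E_{nm}(u^2) = \frac{n}{n+m}E_{n-1,m}(u^2) + \frac{m}{n+m}E_{n,m-1}(u^2) + \frac{nm}{4}$ with zero initial values; it builds the second moment from that recurrence in exact arithmetic and compares it with (5). Function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def second_moment(n: int, m: int) -> Fraction:
    """E_nm(u^2) built up from recurrence (3), starting from the zero values in (4)."""
    if n == 0 or m == 0:
        return Fraction(0)
    return (Fraction(n, n + m) * second_moment(n - 1, m)
            + Fraction(m, n + m) * second_moment(n, m - 1)
            + Fraction(n * m, 4))


def closed_form(n: int, m: int) -> Fraction:
    """Right-hand side of (5): nm(n + m + 1)/12."""
    return Fraction(n * m * (n + m + 1), 12)


if __name__ == "__main__":
    agree = all(second_moment(n, m) == closed_form(n, m)
                for n in range(9) for m in range(9))
    print("recurrence (3) agrees with (5):", agree)
```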


The fourth moment is similarly a solution of the recurrence relation

(6)  $E_{nm}(u^4) = \frac{n}{n+m}\,E_{n-1,m}(u^4) + \frac{m}{n+m}\,E_{n,m-1}(u^4) + \frac{nm}{16}\,(2n^2m + 2nm^2 - n^2 - m^2 - nm),$

which is obtained from (1) by multiplication by $(U - nm/2)^4$ and expansion.

The initial conditions of (6) are found by direct calculation to be

(7)  $E_{n0}(u^4) = E_{0m}(u^4) = 0.$

It may be verified that

(8)  $E_{nm}(u^4) = \frac{nm(n + m + 1)}{240}\,(5n^2m + 5nm^2 - 2n^2 - 2m^2 + 3nm - 2n - 2m)$

satisfies the recurrence relation (6) and its initial conditions (7) and hence (8) follows by mathematical induction.
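Formula (8) can be checked against a brute-force enumeration for small samples. The sketch below assumes that U counts the number of times a y precedes an x, that (8) carries the factor $nm(n+m+1)/240$, and that the inhomogeneous term of (6) is $\frac{nm}{16}(2n^2m + 2nm^2 - n^2 - m^2 - nm)$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def fourth_moment_by_enumeration(n: int, m: int) -> Fraction:
    """E(u^4), u = U - nm/2, averaged over every ordering of n x's and m y's."""
    total = Fraction(0)
    count = 0
    for x_positions in combinations(range(n + m), n):
        # The k-th x (0-based) at position pos has pos - k y's in front of it.
        u_stat = sum(pos - k for k, pos in enumerate(x_positions))
        total += (Fraction(u_stat) - Fraction(n * m, 2)) ** 4
        count += 1
    return total / count


def fourth_moment_formula(n: int, m: int) -> Fraction:
    """Closed form (8), under the reading stated in the lead-in."""
    poly = 5*n*n*m + 5*n*m*m - 2*n*n - 2*m*m + 3*n*m - 2*n - 2*m
    return Fraction(n * m * (n + m + 1) * poly, 240)


def recurrence_rhs(n: int, m: int) -> Fraction:
    """Right-hand side of (6) evaluated with the closed form (8); needs n, m >= 1."""
    extra = Fraction(n * m, 16) * (2*n*n*m + 2*n*m*m - n*n - m*m - n*m)
    return (Fraction(n, n + m) * fourth_moment_formula(n - 1, m)
            + Fraction(m, n + m) * fourth_moment_formula(n, m - 1)
            + extra)


if __name__ == "__main__":
    for n in range(1, 5):
        for m in range(1, 5):
            print(n, m, fourth_moment_by_enumeration(n, m) == fourth_moment_formula(n, m))
```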

To investigate the limit distribution of u as n, m become infinite we investigate the 2rth moment. Following the same procedure as in the case of the second and fourth moments and using the symmetry of the distribution to find the odd moments zero we get the following recurrence relation.

(9)  $E_{nm}(u^{2r}) = \frac{1}{n+m}\sum_{a=0}^{r}\binom{2r}{2a}\frac{1}{2^{2a}}\left\{n\,m^{2a}E_{n-1,m}(u^{2r-2a}) + m\,n^{2a}E_{n,m-1}(u^{2r-2a})\right\}$

For r = 1, 2 it is known that $E_{nm}(u^{2r})$ is a polynomial in n and m of degree 3r and that it is divisible by nm(n + m + 1). Assuming that $E_{nm}(u^{2a})$, a < r, is a polynomial in n and m of degree 3a divisible by nm(n + m + 1) we will










show that it is possible to find a polynomial of degree 3r in n and m divisible by nm(n + m + 1) which satisfies the recurrence relation (9) for $E_{nm}(u^{2r})$ and also its initial conditions, namely, $E_{n0}(u^{2r}) = E_{0m}(u^{2r}) = 0$.

The last condition is trivially satisfied if $E_{nm}(u^{2r})$ is divisible by nm(n + m + 1). Our method here is to actually substitute a polynomial with undetermined coefficients into (9) and show that the coefficients can be obtained uniquely. Rearranging (9) we obtain

(10)  $E_{nm}(u^{2r}) - \frac{n}{n+m}E_{n-1,m}(u^{2r}) - \frac{m}{n+m}E_{n,m-1}(u^{2r}) = \frac{1}{n+m}\sum_{a=1}^{r}\binom{2r}{2a}\frac{1}{2^{2a}}\left\{n\,m^{2a}E_{n-1,m}(u^{2r-2a}) + m\,n^{2a}E_{n,m-1}(u^{2r-2a})\right\}$

Since for $\lambda < r$ we can write $E_{nm}(u^{2\lambda}) = nm(n + m + 1)P^{3\lambda-3}_{nm}$ where $P^{3\lambda-3}_{nm}$ is a polynomial in n, m of degree $3\lambda - 3$ the above equation reduces to

(11)  $E_{nm}(u^{2r}) - \frac{n}{n+m}E_{n-1,m}(u^{2r}) - \frac{m}{n+m}E_{n,m-1}(u^{2r}) = nm\,Q^{3r-3}_{nm}$

where $Q^{3r-3}_{nm}$ is a polynomial in n, m of degree $3r - 3$.

Now let

$E_{nm}(u^{2r}) = nm(n + m + 1)\sum_{i+j \le 3r-3} a_{ij}\,n^i m^j$

where $a_{ij} = a_{ji}$ are to be determined. Substitution in (11) yields:

$\sum a_{ij}\left[(n + m + 1)n^i m^j - (n - 1)(n - 1)^i m^j - (m - 1)n^i(m - 1)^j\right] = Q^{3r-3}_{nm}$

and rearrangement yields:

(12)  $\sum_{i,j \ge 0,\ i+j \le 3r-3} a_{ij}\left[n^i m^j + \sum_{a=0}^{i}\binom{i+1}{a}(-1)^{i-a}(n^a m^j + n^j m^a)\right] = Q^{3r-3}_{nm}$

Consider first the terms of degree $3r - 3$. In this case $i + j = 3r - 3$ and $a = i$ will give

$\sum_{i=0}^{3r-3} a_{i,3r-3-i}\left[n^i m^{3r-3-i} + (i + 1)(n^i m^{3r-3-i} + n^{3r-3-i} m^i)\right]$

or

(13)  $3r\sum_{i=0}^{3r-3} a_{i,3r-3-i}\,n^i m^{3r-3-i}$

Equating the coefficients of these terms of degree $3r - 3$ to the corresponding ones in $Q^{3r-3}_{nm}$ it is possible to calculate the value of $a_{i,3r-3-i}$, $(i = 0, \cdots, 3r - 3)$.


We assume now that the $a_{ij}$ are known for $i + j \ge 3r - 3 - (k - 1)$ and we will find the value of $a_{ij}$ where $i + j = 3r - 3 - k$. Consider then the terms in (12) of degree $3r - 3 - k$. These terms will occur when

$i + j = 3r - 3,\ a = i - k;\quad i + j = 3r - 4,\ a = i - k + 1;\quad \cdots;\quad i + j = 3r - 3 - k,\ a = i.$

All, but the last, contain coefficients which have already been evaluated. The last one reduces to

$(3r - k)\sum_{i=0}^{3r-3-k} a_{i,3r-3-k-i}\,n^i m^{3r-3-k-i}.$

Thus by equating coefficients, the $a_{i,3r-3-k-i}$ for $i = 0, 1, \cdots, 3r - 3 - k$ can be evaluated in terms of the coefficients $a_{ij}$ already obtained and those in $Q^{3r-3}_{nm}$. This concludes the proof that $E_{nm}(u^{2r})$ is a polynomial in n, m of degree 3r and is divisible by nm(n + m + 1).
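We now investigate the coefficients of the terms of degree 3r. For $\lambda = 1, 2$

$E_{nm}(u^{2\lambda}) = \frac{(2\lambda - 1)\cdots 5\cdot 3\cdot 1}{12^{\lambda}}\,(nm)^{\lambda}(n + m + 1)^{\lambda} + (\text{terms of degree} < 3\lambda).$

We assume this to hold for $\lambda < r$ and we will show that it holds for $\lambda = r$. Substitution reduces the right side of (10) to

$\frac{1}{n+m}\binom{2r}{2}\frac{1}{2^2}\,\frac{(2r - 3)\cdots 5\cdot 3\cdot 1}{12^{r-1}}\left[n m^2 (n - 1)^{r-1} m^{r-1}(n + m)^{r-1} + m n^2\, n^{r-1}(m - 1)^{r-1}(n + m)^{r-1}\right] + (\text{terms of degree} < 3r - 1)$

or

$\frac{r(2r - 1)}{4}\,(n + m)^{r-2}\,\frac{(2r - 3)\cdots 5\cdot 3\cdot 1}{12^{r-1}}\left[n(n - 1)^{r-1} m^{r+1} + m(m - 1)^{r-1} n^{r+1}\right] + (\text{terms of degree} < 3r - 1)$

which reduces to

$\frac{3r(2r - 1)\cdots 5\cdot 3\cdot 1}{12^{r}}\,(nm)^{r}(n + m)^{r-1} + (\text{terms of degree} < 3r - 1).$

Comparison of coefficients with (13) multiplied by nm gives

$nm\sum_{i=0}^{3r-3} a_{i,3r-3-i}\,n^i m^{3r-3-i} = \frac{(2r - 1)\cdots 5\cdot 3\cdot 1}{12^{r}}\,(nm)^{r}(n + m)^{r-1}$

or

(14)  $E_{nm}(u^{2r}) = \frac{(2r - 1)\cdots 5\cdot 3\cdot 1}{12^{r}}\,(nm)^{r}(n + m + 1)^{r} + (\text{terms of degree} < 3r).$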







We now wish to show that $E_{nm}(u^{2r})$ is at most of degree 2r in n or m. For r = 1, 2 this has already been established. Assuming that it is true for lower moments the right side of (10), which reduces to $nmQ^{3r-3}_{nm}$, is at most of degree $2r - 1$ in n. We again compare coefficients in (12). First, for terms of degree $3r - 3$ we have already seen that n has degree at most $2r - 2$. For terms of degree $3r - 4$ we use $i + j = 3r - 3$, $a = i - 1$ and $i + j = 3r - 4$, $a = i$. The first case gives rise to no terms in n of degree greater than $2r - 2$ so when we solve for the coefficients $a_{i,3r-4-i}$ the coefficients of terms in n of degree greater than $2r - 2$ must be zero. The process repeats and we find no terms in n or m of degree greater than $2r - 2$ in the left side of (12). This gives $E_{nm}(u^{2r})$ at most the degree 2r in n or m.

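Now consider the ratio

$\frac{E_{nm}(u^{2r})}{[E_{nm}(u^{2})]^{r}} = \frac{(2r - 1)\cdots 5\cdot 3\cdot 1}{12^{r}}\cdot\frac{(nm)^{r}(n + m + 1)^{r}}{[nm(n + m + 1)/12]^{r}} + \frac{(\text{terms of degree} < 3r;\ \text{in } n \text{ or } m,\ \le 2r)}{[nm(n + m + 1)/12]^{r}}$

$= (2r - 1)\cdots 5\cdot 3\cdot 1 + \frac{(\text{terms of degree} < 3r;\ \text{in } n \text{ or } m,\ \le 2r)}{[nm(n + m + 1)/12]^{r}}.$

Hence

(15)  $\lim_{n,m\to\infty}\frac{E_{nm}(u^{2r})}{[E_{nm}(u^{2})]^{r}} = (2r - 1)\cdots 5\cdot 3\cdot 1$

and by a well known theorem it follows from (15) that the limit distribution is
normal.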
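For r = 2 the limit in (15) can be watched numerically, since both moments are available in closed form. The sketch below assumes the closed forms $E_{nm}(u^2) = nm(n+m+1)/12$ and $E_{nm}(u^4) = \frac{nm(n+m+1)}{240}(5n^2m + 5nm^2 - 2n^2 - 2m^2 + 3nm - 2n - 2m)$; the ratio should approach $3 \cdot 1 = 3$, the corresponding ratio for a normal distribution.

```python
from fractions import Fraction


def moment_ratio(n: int, m: int) -> Fraction:
    """E_nm(u^4) / [E_nm(u^2)]^2 computed from the closed forms (8) and (5)."""
    second = Fraction(n * m * (n + m + 1), 12)
    poly = 5*n*n*m + 5*n*m*m - 2*n*n - 2*m*m + 3*n*m - 2*n - 2*m
    fourth = Fraction(n * m * (n + m + 1) * poly, 240)
    return fourth / second ** 2


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, float(moment_ratio(size, size)))
```

The printed ratios increase towards 3 as n = m grows, in line with (15).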


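5. Consistency of the U test. If f and g are the cumulative distribution functions of the x's and y's then our null hypothesis is f = g. The alternatives admitted are f(a) > g(a) for every a. Let $E_A$ denote the expectation under the alternative.

Defining

$x_{ij} = \begin{cases} 0 & \text{if } x_i < y_j \\ 1 & \text{if } x_i > y_j \end{cases}$

we have

$E_A(x_{ij}) = P(x_i > y_j) = \int g\,df < \tfrac{1}{2}$

$E_A(x_{ij}x_{ik}) = P(x_i > y_j,\ x_i > y_k) = \int g^2\,df < \tfrac{1}{3}$

$E_A(x_{ih}x_{jh}) = P(x_i > y_h,\ x_j > y_h) = \int (1 - f)^2\,dg < \tfrac{1}{3}$

We can now write

$E_A(x_{ij}) = \tfrac{1}{2} - \lambda,\quad E_A(x_{ij}x_{ik}) = \tfrac{1}{3} - \epsilon_1,\quad E_A(x_{ih}x_{jh}) = \tfrac{1}{3} - \epsilon_2$

where $\lambda, \epsilon_1, \epsilon_2$ are positive numbers.
We have then

$\sigma_A^2(x_{ij}) = \tfrac{1}{4} - \lambda^2,\qquad \sigma_A(x_{ij}x_{ik}) = \tfrac{1}{12} - \epsilon_1 + \lambda - \lambda^2,$

$\sigma_A(x_{ij}x_{kl}) = 0 \text{ for } i \ne k,\ j \ne l,\qquad \sigma_A(x_{ih}x_{jh}) = \tfrac{1}{12} - \epsilon_2 + \lambda - \lambda^2.$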

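Now

(16)  $E_A(U) = \sum E_A(x_{ij}) = nm/2 - \lambda nm$

and

(17)  $\sigma_A^2(U) = \sum \sigma_A^2(x_{ij}) + \sum \sigma_A(x_{ij}x_{ik}) + \sum \sigma_A(x_{ih}x_{jh}) + \sum \sigma_A(x_{ij}x_{kl})$

or

$\sigma_A^2(U) = nm(n + m + 1)/12 + nm\left[-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right].$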

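Let the critical region under the null hypothesis consist of those U's satisfying $nm/2 - U > t_n\sigma$ where $\lim t_n = t$. Then

$P(nm/2 - U > t_n\sigma \mid A) = P(E_A(U) - U > k\sigma_A)$ where $k = \frac{t_n\sigma - \lambda nm}{\sigma_A}$

and by Tchebycheff's inequality, since for large values of n, m, k < 0

$P(nm/2 - U > t_n\sigma \mid A) \ge 1 - \frac{1}{k^2}$

which by (5) and (17) gives

$P(nm/2 - U > t_n\sigma \mid A) \ge 1 - \frac{nm(n + m + 1)/12 + nm\left[-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right]}{\left(t_n\sqrt{nm(n + m + 1)/12} - \lambda nm\right)^2}$

$= 1 - \frac{1 + \frac{12}{n + m + 1}\left\{-\lambda^2(n + m - 1) + (\lambda - \epsilon_1)(m - 1) + (\lambda - \epsilon_2)(n - 1)\right\}}{\left(t_n - \lambda\sqrt{\frac{12nm}{n + m + 1}}\right)^2}.$

We obtain then that

$\lim_{n,m\to\infty} P(nm/2 - U > t_n\sigma \mid A) = 1$

which is the requirement for consistency.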
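The behaviour of the lower bound can be illustrated for a concrete alternative. The sketch below is an assumption-laden example, not taken from the paper: the x's are taken uniform on (0, 1), so f(a) = a, and the y's have g(a) = a^2, which gives $\lambda = 1/6$, $\epsilon_1 = 2/15$, $\epsilon_2 = 1/6$; the critical value t = 1.645 is likewise a hypothetical choice. The bound is $1 - \sigma_A^2/(t\sigma - \lambda nm)^2$ and is only used once $t\sigma - \lambda nm < 0$.

```python
from math import sqrt

# Hypothetical alternative: f(a) = a, g(a) = a**2 on (0, 1).
LAM = 1 / 2 - 1 / 3      # E_A(x_ij) = integral of g df = 1/3
EPS1 = 1 / 3 - 1 / 5     # E_A(x_ij x_ik) = integral of g^2 df = 1/5
EPS2 = 1 / 3 - 1 / 6     # E_A(x_ih x_jh) = integral of (1 - f)^2 dg = 1/6


def chebyshev_power_bound(n: int, m: int, t: float = 1.645) -> float:
    """Lower bound for P(nm/2 - U > t*sigma | A) from Tchebycheff's inequality."""
    var_null = n * m * (n + m + 1) / 12
    var_alt = var_null + n * m * (-LAM ** 2 * (n + m - 1)
                                  + (LAM - EPS1) * (m - 1)
                                  + (LAM - EPS2) * (n - 1))
    gap = t * sqrt(var_null) - LAM * n * m
    if gap >= 0:
        # k is not yet negative, so the inequality gives no information.
        return 0.0
    return max(0.0, 1 - var_alt / gap ** 2)


if __name__ == "__main__":
    for size in (10, 50, 200, 1000):
        print(size, round(chebyshev_power_bound(size, size), 3))
```

For this example the bound is uninformative for small samples and then rises towards 1 as n and m grow, which is the content of the consistency statement.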







6. Comparison with other tests. Another test which might seem appropriate 
for the comparison of a control group with a group receiving treatment is the 
test introduced by Wald and Wolfowitz [2]. The test by Wald and Wolfowitz is 
consistent with respect to every alternative g. However in the case considered 
we are only interested in the alternative hypothesis that measurements in the 
group receiving treatment are stochastically larger than in the control group. 
Intuitively, it seems that the test proposed here is more efficient for detecting the 
particular alternative considered than the test proposed by Wald and Wolfowitz. 
This intuitive feeling was borne out by the results of the test in the particular 
experiment described in the introduction. All in all, 62 experiments were 
conducted using various bacteria in different solutions and various amounts of 
the protective drug. The U Test gave 14 significant results on the 5% level 
and 4 on the 1% level. The test of Wald and Wolfowitz gave 7 significant 
results on the 5% level and 2 on the 1% level. A final decision between the two 
tests can, of course, only be arrived at on the basis of their power functions, 
which present formidable difficulties. 

In comparing the two statistics it was noted that a slight dislocation of a value may cause a significant change in the number of runs more easily than it can cause a significant change in the statistic proposed here. For instance, in the sequence $x_1x_2x_3x_4x_5x_6y_1y_2y_3y_4y_5y_6$ both statistics would give a probability less than .05. If however, the sequence is slightly altered to $x_1x_2x_3x_4x_5y_1x_6y_2y_3y_4y_5y_6$, P(number of runs $\le$ 4) > .05 while P(U $\le$ 1) = .002.
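The two probabilities in this example can be obtained by enumerating all orderings. The sketch below assumes six x's and six y's (the sample sizes for which Table I gives P(U ≤ 1) = .002) and assumes that U counts the number of times a y precedes an x; the function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations


def null_probabilities(n: int, m: int, max_runs: int, max_u: int):
    """Return (P(number of runs <= max_runs), P(U <= max_u)) over all equally likely orderings."""
    total = runs_hits = u_hits = 0
    for x_positions in combinations(range(n + m), n):
        x_set = set(x_positions)
        sequence = ["x" if i in x_set else "y" for i in range(n + m)]
        # A new run starts wherever two neighbouring symbols differ.
        runs = 1 + sum(a != b for a, b in zip(sequence, sequence[1:]))
        # The k-th x (0-based) at position pos has pos - k y's in front of it.
        u_stat = sum(pos - k for k, pos in enumerate(x_positions))
        total += 1
        runs_hits += runs <= max_runs
        u_hits += u_stat <= max_u
    return Fraction(runs_hits, total), Fraction(u_hits, total)


if __name__ == "__main__":
    p_runs, p_u = null_probabilities(6, 6, max_runs=4, max_u=1)
    print(float(p_runs), float(p_u))
```

The altered sequence has 4 runs and U = 1, so these are the two tail probabilities being compared.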

After completion of the present paper it came to the authors' attention that
the U test had already been proposed by K. K. Mathen [3]. However Mathen's
distribution of U is incorrect and its derivation erroneous, since it assumes
independence of the random variables $x_{ij}$ as defined in section 5 of the present
paper, while obviously $x_{ij}$ and $x_{ik}$ are not independent.


REFERENCES 


[1] Frank Wilcoxon, "Individual comparisons by ranking methods", Biometrics Bull.,
Vol. 1 (1945), pp. 80-83.

[2] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same population", Annals of Math. Stat., Vol. 11 (1940), pp. 147-162.

[3] K. K. Mathen, Sankhyā, 1946, p. 329.





ON THE CONVERGENCE OF SEQUENCES OF MOMENT 
GENERATING FUNCTIONS 


By W. Kozakiewicz
University of Saskatchewan 


1. Summary. The purpose of this paper is to give a few theorems con- 
cerning the reciprocal relation between the convergence of a sequence of distribu- 
tion functions and the convergence of the corresponding sequence of their 
moment generating functions. 

The paper consists of two parts. In the first part the univariate case is 
discussed. The content of this part is closely related to that of a recent paper 
by J. H. Curtiss [1, p. 430-433], but the results are of a somewhat more general 
nature, and the methods of proofs are different and do not make use of the theory 
of a complex variable. The second part deals with the multivariate case which, 
as far as the author knows, has not been treated before with proofs in as com- 
plete and rigorous a way. 

In both the univariate and multivariate cases the proofs are based on the well 
known Helly selection principle [2, p. 26] for bounded sequences of monotonic 
functions. 

2. The univariate case. Let X be a random variable and F(x) its distribution function. That is, for any real x, $F(x) = P\{X \le x\}$, where $P\{X \le x\}$ denotes the probability of the event $X \le x$. The function

$\varphi(t) = E(e^{tX}) = \int_{-\infty}^{+\infty} e^{tx}\,dF(x),$

in which the integral is taken in the Stieltjes-Riemann sense and is assumed to converge in some neighborhood of the origin, is called the moment generating function of X (or of F(x)).

Henceforth we use the abbreviations d.f. and m.g.f. for the terms distribution function and moment generating function respectively. The variable t will always be real.

THEOREM 1. Let $\{F_n(x)\}$ be a sequence of d.f.'s. Let M(x) for any fixed non-negative x be the least upper bound of the sequence $\{F_n(-x) + 1 - F_n(x)\}$. If the sequence $\{F_n(x)\}$ converges on an everywhere dense set of points on the x-axis, and if there exists a positive number $\alpha$ such that for any fixed t in the interval $|t| < \alpha$

(1)  $\lim_{x\to+\infty} e^{|t|x}M(x) = 0,$

then:
(a) there exists a d.f. F(x) such that $\lim_{n\to\infty} F_n(x) = F(x)$ at each point of continuity of F(x);
(b) the m.g.f.'s of F(x) and $F_n(x)$, say $\varphi(t)$ and $\varphi_n(t)$, exist for $|t| < \alpha$;
(c) $\lim_{n\to\infty} \varphi_n(t) = \varphi(t)$ for $|t| < \alpha$ and uniformly in each interval $|t| \le \beta < \alpha$.











62 W. KOZAKIEWICZ 


To prove (a), it may be noticed that there exists a function F(x), non-decreasing and continuous on the right, such that $\lim_{n\to\infty} F_n(x) = F(x)$ at each point of continuity of F(x). But F(x) must be a distribution function. Indeed, we have for x > 0

(2)  $F(-x) + 1 - F(x) \le M(x-).$

Now from (1), putting t = 0, we find that M(x) and consequently M(x-) approach zero as $x \to +\infty$. This proves that $F(-\infty) = 0$ and $F(+\infty) = 1$.
To prove (b), we notice first that the integral

$\varphi_n(t) = \int_{-\infty}^{+\infty} e^{tx}\,dF_n(x) \qquad (n = 1, 2, \cdots),$

is convergent for $|t| < \alpha$. This follows immediately from (1) by applying the method of integration by parts to the integrals

$\int_{0}^{N} e^{tx}\,dF_n(x)$ and $\int_{-N}^{0} e^{tx}\,dF_n(x),$

which for any t in the interval $|t| < \alpha$ will be seen to be bounded for all values of N. By the same argument, the relation $\lim_{x\to+\infty} M(x-)e^{|t|x} = 0$, $|t| < \alpha$, which can be easily deduced from (1), together with (2) imply that the integral representing $\varphi(t)$ is convergent for $|t| < \alpha$.
Let now $\beta$ be a positive number less than $\alpha$ and let $\gamma$ be such that $\beta < \gamma < \alpha$. Let $M_1$ be the least upper bound of $M(x)e^{\gamma x}$ for $x \ge 0$. Using the method of integration by parts and applying (1) we have for $|t| \le \beta$

(3)  $\int_{N}^{+\infty} e^{tx}\,dF_n(x) = [1 - F_n(N)]e^{tN} + t\int_{N}^{+\infty} e^{tx}[1 - F_n(x)]\,dx \le M(N)e^{\beta N} + M_1\beta\,\frac{e^{-(\gamma-\beta)N}}{\gamma - \beta}.$

We could prove easily that the same inequality is true for the integrals

$\int_{-\infty}^{-N} e^{tx}\,dF_n(x),\qquad \int_{N}^{+\infty} e^{tx}\,dF(x),\qquad \int_{-\infty}^{-N} e^{tx}\,dF(x).$

Now let $\epsilon$ be any positive number. Because of (3), we have

(4)  $\int_{|x| > N_0} e^{tx}\,dF_n(x) < \epsilon,\qquad \int_{|x| > N_0} e^{tx}\,dF(x) < \epsilon,$

for a sufficiently great number $N_0$, and uniformly with respect to n and t, when $|t| \le \beta$. Clearly, $N_0$ can be so chosen that F(x) is continuous for $x = \pm N_0$. Then

(5)  $\lim_{n\to\infty}\int_{-N_0}^{N_0} e^{tx}\,dF_n(x) = \int_{-N_0}^{N_0} e^{tx}\,dF(x),$

uniformly for $|t| \le \beta$.











MOMENT GENERATING FUNCTIONS 63 


The relations (4) and (5) prove that $\varphi_n(t) \to \varphi(t)$ as $n \to \infty$, uniformly for $|t| \le \beta$. But $\beta$ can be chosen as near to $\alpha$ as we please; thus (c) is proved.

THEOREM 2. Let $\{F_n(x)\}$ be a sequence of d.f.'s and $\{\varphi_n(t)\}$ the corresponding sequence of m.g.f.'s. If $\varphi_n(t)$ exists for $|t| < \alpha$, and if there exists a finite valued function $\varphi(t)$ defined for $|t| < \alpha$, such that $\lim_{n\to\infty} \varphi_n(t) = \varphi(t)$ for $|t| < \alpha$, then

(a) $\lim_{x\to+\infty} M(x)e^{|t|x} = 0$ for $|t| < \alpha$;
(b) there exists a d.f. F(x) such that $\lim_{n\to\infty} F_n(x) = F(x)$ at each point of continuity of F(x);
(c) the m.g.f. of F(x) exists for $|t| < \alpha$ and is identically equal to $\varphi(t)$ in this interval;
(d) $\lim_{n\to\infty} \varphi_n(t) = \varphi(t)$ uniformly in each interval $|t| \le \beta < \alpha$.


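To prove (a), let t be a number in the interval $|t| < \alpha$, and let $\beta$ be chosen so that $|t| < \beta < \alpha$. Then, for x > 0, we have

$F_n(-x) + 1 - F_n(x) = \int_{-\infty}^{-x} dF_n(u) + \int_{x}^{+\infty} dF_n(u)$

$\le e^{-\beta x}\int_{-\infty}^{-x} e^{-\beta u}\,dF_n(u) + e^{-\beta x}\int_{x}^{+\infty} e^{\beta u}\,dF_n(u)$

$\le e^{-\beta x}[\varphi_n(-\beta) + \varphi_n(\beta)].$

Consequently

$M(x)e^{|t|x} \le e^{-(\beta - |t|)x}\ \text{l.u.b.}\,\{\varphi_n(-\beta) + \varphi_n(\beta)\},$

and since the sequences $\{\varphi_n(-\beta)\}$ and $\{\varphi_n(\beta)\}$ are convergent, and therefore bounded, it follows that $M(x)e^{|t|x}$ approaches zero as $x \to +\infty$.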
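The tail inequality used in this step can be checked numerically for a single d.f. The sketch below is an illustration only and uses an assumption not in the paper: the standard normal d.f., for which $F(-x) + 1 - F(x)$ is `erfc(x / sqrt(2))` and the m.g.f. is $e^{\beta^2/2}$, so the bound reads $e^{-\beta x}[\varphi(-\beta) + \varphi(\beta)]$.

```python
from math import erfc, exp, sqrt


def two_sided_tail(x: float) -> float:
    """F(-x) + 1 - F(x) for the standard normal d.f. (assumed example)."""
    return erfc(x / sqrt(2))


def mgf_tail_bound(x: float, beta: float) -> float:
    """exp(-beta*x) * [phi(-beta) + phi(beta)] with phi(t) = exp(t^2 / 2)."""
    mgf = exp(beta ** 2 / 2)
    return exp(-beta * x) * (mgf + mgf)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0):
        print(x, two_sided_tail(x), mgf_tail_bound(x, beta=2.0))
    # With |t| = 1 < beta, the product exp(|t| x) * tail is already negligible at x = 10.
    print(exp(1.0 * 10.0) * two_sided_tail(10.0))
```

The bound holds for every positive beta, and multiplying the tail by $e^{|t|x}$ with $|t| < \beta$ still leaves a quantity that tends to zero.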

To prove (b) we may notice that by the Helly selection principle we can choose a subsequence $\{F_{n_k}(x)\}$ which is convergent to some non-decreasing function F(x), at each point of continuity of F(x). Now Theorem 1 together with (a) imply that F(x) is a d.f. and that the limit of the subsequence $\{\varphi_{n_k}(t)\}$, namely $\varphi(t)$, must be identical, for $|t| < \alpha$, with the m.g.f. of F(x). By the uniqueness property of a m.g.f. we know that F(x) is uniquely determined by $\varphi(t)$, and therefore it follows that every convergent subsequence of $\{F_n(x)\}$ approaches the same limit F(x) at each point of continuity of F(x). This is, however, equivalent to the statement that the sequence $\{F_n(x)\}$ itself converges to F(x) at each point of continuity of F(x). Thus (b) is proved. We see at once that (c) and (d) follow immediately from Theorem 1.
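A familiar illustration of Theorem 2, not taken from the paper, is the binomial distribution with parameters n and 2/n, whose m.g.f. $(1 + 2(e^t - 1)/n)^n$ converges for every t to the Poisson m.g.f. $\exp(2(e^t - 1))$. The sketch below, with the mean 2 chosen arbitrarily, compares the m.g.f.'s on a grid of t values and the d.f.'s at one point.

```python
from math import comb, exp, factorial

MEAN = 2.0  # hypothetical common mean of the binomial and Poisson laws


def mgf_binomial(n: int, t: float) -> float:
    return (1 + MEAN * (exp(t) - 1) / n) ** n


def mgf_poisson(t: float) -> float:
    return exp(MEAN * (exp(t) - 1))


def cdf_binomial(n: int, k: int) -> float:
    p = MEAN / n
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))


def cdf_poisson(k: int) -> float:
    return sum(exp(-MEAN) * MEAN ** j / factorial(j) for j in range(k + 1))


def max_mgf_gap(n: int, beta: float = 1.0, steps: int = 41) -> float:
    """Largest |phi_n(t) - phi(t)| over an evenly spaced grid on [-beta, beta]."""
    grid = [-beta + 2 * beta * i / (steps - 1) for i in range(steps)]
    return max(abs(mgf_binomial(n, t) - mgf_poisson(t)) for t in grid)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_mgf_gap(n), abs(cdf_binomial(n, 2) - cdf_poisson(2)))
```

Both gaps shrink together as n grows, which is the behaviour parts (b) and (d) describe.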

Theorem 2 is of course similar to Theorem 3 in the paper of Curtiss [1, p. 432]. The proof of (a), however, is not contained in his paper. From Theorems 1 and 2 there follows immediately

THEOREM 3. Let $\{F_n(x)\}$ be a sequence of d.f.'s, and let $\{\varphi_n(t)\}$ be the corresponding sequence of m.g.f.'s, which are all assumed to exist for $|t| < \alpha$. The necessary and sufficient conditions for the convergence of $\{\varphi_n(t)\}$ in the interval $|t| < \alpha$, are:

(a) $\lim_{x\to+\infty} M(x)e^{|t|x} = 0$, $|t| < \alpha$;

(b) the sequence $\{F_n(x)\}$ converges to a d.f. F(x) at each point of continuity of F(x).

Further, the m.g.f. of F(x) exists for $|t| < \alpha$ and is equal in this interval to the limit of the sequence $\{\varphi_n(t)\}$.

In his paper Curtiss gives an example of a sequence $\{F_n(x)\}$ of d.f.'s which converges to a d.f. F(x), while the corresponding sequence $\{\varphi_n(t)\}$ of m.g.f.'s does not converge to the m.g.f. $\varphi(t)$ of the d.f. F(x), though both $\varphi_n(t)$, $(n = 1, 2, \cdots)$, and $\varphi(t)$ exist for all t. It may be easily proved by the direct method that in the case considered condition (a) of Theorem 3 is not satisfied.

It is perhaps worth while to notice that condition (a) of Theorem 3 may be expressed also as follows:

$\overline{\lim}_{x\to+\infty}\ x^{-1}\log M(x) \le -\alpha.$

3. The multivariate case. For the sake of simplicity we shall consider here the bivariate case only. The results obtained in this section can, however, be easily extended to the case when d.f.'s and m.g.f.'s are defined in the Euclidean space of any finite number of dimensions.

Let $(X_1, X_2)$ be a random vector variable in the two-dimensional Euclidean space, and let $F(x_1, x_2)$ be its d.f. That is, for any real numbers $x_1$ and $x_2$,

$F(x_1, x_2) = P\{X_1 \le x_1,\ X_2 \le x_2\}.$

Let

$F_1(x_1) = P\{X_1 \le x_1\} = F(x_1, +\infty),$
$F_2(x_2) = P\{X_2 \le x_2\} = F(+\infty, x_2);$

then $F_1(x_1)$ and $F_2(x_2)$ are called the marginal d.f.'s of $X_1$ and $X_2$ respectively. The m.g.f.'s of the d.f.'s $F(x_1, x_2)$, $F_1(x_1)$ and $F_2(x_2)$ are defined by the equations:

$\varphi(t_1, t_2) = E(e^{t_1X_1 + t_2X_2}) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2),$

$\varphi_i(t_i) = E(e^{t_iX_i}) = \int_{-\infty}^{+\infty} e^{t_ix_i}\,dF_i(x_i), \qquad (i = 1, 2),$

in which the integrals are assumed to converge in some neighborhood of the origin. It is easy to see that $\varphi_1(t_1) = \varphi(t_1, 0)$ and $\varphi_2(t_2) = \varphi(0, t_2)$.

THEOREM 4. Let $\varphi(t_1, t_2)$ and $\varphi^*(t_1, t_2)$ be the m.g.f.'s of d.f.'s $F(x_1, x_2)$ and $F^*(x_1, x_2)$ respectively. If $\varphi(t_1, t_2)$ and $\varphi^*(t_1, t_2)$ exist and are equal in some neighborhood of the origin $|t_i| < \alpha_i$, $(i = 1, 2)$, then $F(x_1, x_2) = F^*(x_1, x_2)$ identically.

To prove this theorem, let us introduce two random vector variables $(X_1, X_2)$ and $(X_1^*, X_2^*)$ of which the d.f.'s are respectively F and F*. Consider now two random variables

$Z = X_1t_1 + X_2t_2,\qquad Z^* = X_1^*t_1 + X_2^*t_2,$

where $t_1$ and $t_2$ denote two real numbers not both zero. If $\varphi(t)$ and $\varphi^*(t)$ are respectively the m.g.f.'s of Z and Z*, we have

$\varphi(t) = \varphi(tt_1, tt_2),\qquad \varphi^*(t) = \varphi^*(tt_1, tt_2).$

Consequently $\varphi(t) = \varphi^*(t)$ provided that $|tt_i| < \alpha_i$, $(i = 1, 2)$. It follows from the uniqueness property of the m.g.f. in the univariate case that the d.f.'s of Z and Z* must be identical. Now, according to a theorem due to Cramér [3, p. 105], if the d.f.'s of Z and Z* coincide for all pairs of values $(t_1, t_2)$ such that $|t_1| + |t_2| \ne 0$, the d.f.'s F and F* must be identical. It may be worth while to reproduce here Cramér's proof. Let $\psi(t_1, t_2) = E(e^{i(t_1X_1 + t_2X_2)})$ and $\psi^*(t_1, t_2) = E(e^{i(t_1X_1^* + t_2X_2^*)})$ be the characteristic functions of F and F* respectively. Then $\psi(tt_1, tt_2)$ and $\psi^*(tt_1, tt_2)$ are the characteristic functions of Z and Z* respectively. Since Z and Z* have the same d.f.'s, it follows that $\psi(tt_1, tt_2) = \psi^*(tt_1, tt_2)$ for all values of t. Putting t = 1, we find that $\psi(t_1, t_2) = \psi^*(t_1, t_2)$ if $|t_1| + |t_2| \ne 0$. For $t_1 = t_2 = 0$, $\psi(0, 0) = \psi^*(0, 0) = 1$. Therefore $\psi(t_1, t_2) = \psi^*(t_1, t_2)$ identically, and since the characteristic function uniquely determines the d.f., it follows that the d.f.'s F and F* are identical.

THEOREM 5. Let $\{F_n(x_1, x_2)\}$ be a sequence of d.f.'s. Let $F_{1n}(x_1)$ and $F_{2n}(x_2)$ be respectively the marginal d.f.'s determined by $F_n(x_1, x_2)$. Let

$M_i(x_i) = \text{l.u.b.}\,\{F_{in}(-x_i) + 1 - F_{in}(x_i)\}$

where $x_i \ge 0$, $(i = 1, 2)$. If there exist positive numbers $\alpha_1$ and $\alpha_2$ such that for $|t_i| < \alpha_i$

(6)  $\lim_{x_i\to+\infty} M_i(x_i)e^{|t_i|x_i} = 0, \qquad (i = 1, 2),$

and if $\{F_n(x_1, x_2)\}$ converges on an everywhere dense set on the $(x_1, x_2)$ plane; then:

(a) there exists a d.f. $F(x_1, x_2)$ such that $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$,

(b) there exist two positive numbers $\delta_1$ and $\delta_2$, $\delta_i \le \alpha_i$, such that the m.g.f.'s of $F(x_1, x_2)$ and $F_n(x_1, x_2)$, say $\varphi(t_1, t_2)$ and $\varphi_n(t_1, t_2)$, exist for $|t_i| < \delta_i$, $(i = 1, 2)$,

(c) $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ for $|t_i| < \delta_i$, and uniformly in each two-dimensional interval $|t_i| \le \beta_i < \delta_i$, $(i = 1, 2)$.

To prove (a), we notice that there obviously exists a function $F(x_1, x_2)$, continuous on the right with respect to each variable, satisfying the relation

$\Delta^2F(x_1, x_2) = F(x_1'', x_2'') + F(x_1', x_2') - F(x_1', x_2'') - F(x_1'', x_2') \ge 0$

for $x_1' < x_1''$, $x_2' < x_2''$, and such that











(7)  $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$

at each point of continuity of $F(x_1, x_2)$. We shall prove that $F(x_1, x_2)$ is a d.f. In fact, it is easy to see that we have for $x_i > 0$, $(i = 1, 2)$,

(8)  $F(-x_1, -x_2) \le F(-x_1, x_2) \le M_1(x_1-),\qquad F(x_1, -x_2) \le M_2(x_2-),$

$1 - F(x_1, x_2) \le M_1(x_1) + M_2(x_2).$

Now, according to (6), $\lim_{x_i\to+\infty} M_i(x_i-) = \lim_{x_i\to+\infty} M_i(x_i) = 0$, $(i = 1, 2)$, therefore it follows from (8) that $F(-\infty, -\infty) = F(-\infty, x_2) = F(x_1, -\infty) = 0$ and $F(+\infty, +\infty) = 1$, which proves that $F(x_1, x_2)$ is a d.f.

To prove (b), let $\varphi_{in}(t_i)$ be the m.g.f. of the d.f. $F_{in}(x_i)$, $(i = 1, 2)$. Let $F_1(x_1)$ and $F_2(x_2)$ be the marginal d.f.'s determined by $F(x_1, x_2)$ and let $\varphi_i(t_i)$ be the m.g.f. of $F_i(x_i)$, $(i = 1, 2)$.

Now let N' > N > 0 and

$R_n(N, N', t_1, t_2) = \int_{-N'}^{N'}\int_{-N'}^{N'} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2) - \int_{-N}^{N}\int_{-N}^{N} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2) = I_1 + I_2 + I_3 + I_4,$

where $I_1, \cdots, I_4$ are the integrals of $e^{t_1x_1 + t_2x_2}$ with respect to $F_n$ over four strips which together make up the region between the two squares, $I_1$ being the integral over $N < x_1 \le N'$, $-N' \le x_2 \le N'$.

Applying the Schwarz inequality to $I_1$, we find

(9)  $I_1 \le \left(\int_{N}^{N'}\int_{-N'}^{N'} e^{2t_1x_1}\,dF_n(x_1, x_2)\right)^{1/2}\left(\int_{N}^{N'}\int_{-N'}^{N'} e^{2t_2x_2}\,dF_n(x_1, x_2)\right)^{1/2}.$

But

(10)  $\int_{N}^{N'}\int_{-N'}^{N'} e^{2t_1x_1}\,dF_n(x_1, x_2) \le \int_{N}^{N'} e^{2t_1x_1}\,dF_{1n}(x_1),$

and similarly

(11)  $\int_{N}^{N'}\int_{-N'}^{N'} e^{2t_2x_2}\,dF_n(x_1, x_2) \le \int_{-N'}^{N'} e^{2t_2x_2}\,dF_{2n}(x_2).$

Let $\epsilon$ be any positive number and $\gamma_i$ a positive number less than $\alpha_i$, $(i = 1, 2)$. It follows from the proof of Theorem 1, taking into account (6), that the integrals representing $\varphi_{in}(t_i)$ and $\varphi_i(t_i)$, $(i = 1, 2)$, exist and are uniformly convergent with respect to n and $t_i$, when $|t_i| \le \gamma_i$, $(i = 1, 2)$. Consequently we have

(12)  $\int_{|x_i| > N} e^{t_ix_i}\,dF_{in}(x_i) < \epsilon,\qquad \int_{|x_i| > N} e^{t_ix_i}\,dF_i(x_i) < \epsilon, \qquad (i = 1, 2),$

uniformly with respect to n and $t_i$ when $|t_i| \le \gamma_i$, $(i = 1, 2)$, provided that N is sufficiently large, say $N \ge N_0$. Let us take $\beta_i = \gamma_i/2$, $(i = 1, 2)$. The integrals representing $\varphi_{in}(t_i)$ and $\varphi_i(t_i)$, $(i = 1, 2)$, are obviously uniformly bounded for all n and when $|t_i| \le \gamma_i$, $(i = 1, 2)$, they are all less than some constant C. Consequently taking into account (9), (10), (11), and (12), we find

$I_1 \le \sqrt{C\epsilon},$

uniformly with respect to n and $t_i$ when $|t_i| \le \beta_i$, $(i = 1, 2)$, provided that $N' > N \ge N_0$. Since the same inequality is true for $I_2$, $I_3$ and $I_4$, we have

(13)  $R_n(N, N', t_1, t_2) \le 4\sqrt{C\epsilon},$

uniformly with respect to n and $t_i$, when $|t_i| \le \beta_i$, $(i = 1, 2)$, provided $N' > N \ge N_0$. Hence the integral representing $\varphi_n(t_1, t_2)$ is uniformly convergent for $|t_i| \le \beta_i$, and consequently convergent for $|t_i| < \alpha_i/2$, $(i = 1, 2)$, since $\beta_i$ can be chosen as near to $\alpha_i/2$ as we please.

Similarly, using (12), we could find

(14)  $R(N, N', t_1, t_2) \le 4\sqrt{C\epsilon}, \qquad |t_i| \le \beta_i,\ N' > N \ge N_0,$

where

$R(N, N', t_1, t_2) = \int_{-N'}^{N'}\int_{-N'}^{N'} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2) - \int_{-N}^{N}\int_{-N}^{N} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2).$

This proves, in turn, that the integral representing $\varphi(t_1, t_2)$ is uniformly convergent for $|t_i| \le \beta_i$ and convergent for $|t_i| < \alpha_i/2$, $(i = 1, 2)$. Thus (b) is proved with $\delta_i = \alpha_i/2$, $(i = 1, 2)$.

To prove (c), let $N' \to +\infty$ and $N = N_0$ in (13) and (14). We obtain

(15)  $R_n(N_0, +\infty, t_1, t_2) \le 4\sqrt{C\epsilon},\qquad R(N_0, +\infty, t_1, t_2) \le 4\sqrt{C\epsilon}$

uniformly with respect to n and $t_i$ when $|t_i| \le \beta_i$. Clearly, $N_0$ can be chosen so that $F_1(x_1)$ and $F_2(x_2)$ are continuous for $x_1 = x_2 = \pm N_0$. Then

(16)  $\lim_{n\to\infty}\int_{-N_0}^{N_0}\int_{-N_0}^{N_0} e^{t_1x_1 + t_2x_2}\,dF_n(x_1, x_2) = \int_{-N_0}^{N_0}\int_{-N_0}^{N_0} e^{t_1x_1 + t_2x_2}\,dF(x_1, x_2),$

uniformly for $|t_i| \le \beta_i$, $(i = 1, 2)$.

The relations (15) and (16) prove that

$\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2),$

uniformly for $|t_i| \le \beta_i$, $(i = 1, 2)$. The ordinary convergence obviously holds for $|t_i| < \alpha_i/2$, $(i = 1, 2)$.

It follows from the above proof, which refers to the bivariate case, that we may take $\delta_i = \alpha_i/2$, $(i = 1, 2)$, in (b) and (c).

The existence of the corresponding numbers $\delta_i$, $\delta_i \le \alpha_i$, $(i = 1, 2, \cdots, k)$, in the k-variate case can be easily established by the repeated application of the Schwarz inequality.


THEOREM 6. Let $\varphi_n(t_1, t_2)$, $\varphi_{in}(t_i)$, $F_n(x_1, x_2)$, $F_{in}(x_i)$ and $M_i(x_i)$, $(i = 1, 2)$, have the same meaning as in Theorem 5. If $\varphi_n(t_1, t_2)$ exist for $|t_i| < \alpha_i$, $(i = 1, 2)$, and if there exists a finite valued function $\varphi(t_1, t_2)$ defined for $|t_i| < \alpha_i$, such that $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$, $|t_i| < \alpha_i$, then

(a) $\lim_{x_i\to+\infty} M_i(x_i)e^{|t_i|x_i} = 0$ for $|t_i| < \alpha_i$,

(b) there exists a d.f. $F(x_1, x_2)$, such that $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$,

(c) the m.g.f. of $F(x_1, x_2)$ exists for $|t_i| < \alpha_i$ and is identically equal to $\varphi(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$,

(d) $\lim_{n\to\infty} \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ uniformly for $|t_i| \le \beta_i < \alpha_i$, $(i = 1, 2)$.

To prove (a), it is sufficient to notice that $\varphi_{1n}(t_1) = \varphi_n(t_1, 0)$ and $\varphi_{2n}(t_2) = \varphi_n(0, t_2)$. Consequently we have

$\lim_{n\to\infty} \varphi_{1n}(t_1) = \varphi(t_1, 0),\qquad \lim_{n\to\infty} \varphi_{2n}(t_2) = \varphi(0, t_2),\qquad |t_i| < \alpha_i,\ (i = 1, 2).$

Therefore (a) follows immediately from Theorem 2.

To prove (b), we may notice that according to the Helly principle of selection applied to the sequence $\{F_n(x_1, x_2)\}$, there exists a subsequence $\{F_{n_k}(x_1, x_2)\}$, selected from the sequence $\{F_n(x_1, x_2)\}$, which is convergent to some function $F(x_1, x_2)$ continuous on the right and with non-negative second difference. But $F(x_1, x_2)$ must be a d.f. according to Theorem 5, since the relation (6) is satisfied by the sequence $\{F_{n_k}(x_1, x_2)\}$. Moreover, the limit of the sequence $\{\varphi_{n_k}(t_1, t_2)\}$, namely $\varphi(t_1, t_2)$, when considered in a sufficiently small neighborhood of the origin, is the m.g.f. of $F(x_1, x_2)$. Since the d.f. is uniquely determined by its m.g.f., it follows that every convergent subsequence of $\{F_n(x_1, x_2)\}$ converges to the same limit $F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$. This is, however, the same as to say that the sequence $\{F_n(x_1, x_2)\}$ itself converges to $F(x_1, x_2)$ at each point of continuity of $F(x_1, x_2)$.

To prove (c), we have to show that the m.g.f. of $F(x_1, x_2)$, say $\varphi^*(t_1, t_2)$, exists for $|t_i| < \alpha_i$ and is equal to $\varphi(t_1, t_2)$, $|t_i| < \alpha_i$, $(i = 1, 2)$. (We have proved that $\varphi^*(t_1, t_2) = \varphi(t_1, t_2)$ only for sufficiently small values of $|t_1|$ and $|t_2|$.) The existence of $\varphi^*(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$, can be easily established by the method used by Curtiss [1, p. 433]. Suppose indeed that $\varphi^*(t_1, t_2)$ does not exist at some point $(t_1^0, t_2^0)$, where $|t_i^0| < \alpha_i$, $(i = 1, 2)$. That means that we can find a positive number N such that

(17)  $\int_{-N}^{N}\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF(x_1, x_2) > \varphi(t_1^0, t_2^0).$

Since $\lim_{n\to\infty} F_n(x_1, x_2) = F(x_1, x_2)$ at all points of continuity of $F(x_1, x_2)$, and since N can be so chosen that the marginal d.f.'s $F_1(x_1)$ and $F_2(x_2)$ are continuous for $x_1 = x_2 = \pm N$, it follows that

(18)  $\lim_{n\to\infty}\int_{-N}^{N}\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF_n(x_1, x_2) = \int_{-N}^{N}\int_{-N}^{N} e^{t_1^0x_1 + t_2^0x_2}\,dF(x_1, x_2).$

The formulas (17) and (18) give $\lim \varphi_n(t_1^0, t_2^0) > \varphi(t_1^0, t_2^0)$, which is impossible because $\lim \varphi_n(t_1, t_2) = \varphi(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$.


To prove that $\varphi(t_1, t_2) = \varphi^*(t_1, t_2)$ for $|t_i| < \alpha_i$, $(i = 1, 2)$, let $(t_1, t_2)$ denote a fixed point such that $|t_i| < \alpha_i$, $(i = 1, 2)$. Clearly, $\varphi_n(tt_1, tt_2)$, $(n = 1, 2, \cdots)$, and $\varphi^*(tt_1, tt_2)$, considered as functions of the variable t, are m.g.f.'s provided that $|tt_i| < \alpha_i$, $(i = 1, 2)$. (See first part of proof of Theorem 4.) Now, according to Theorem 2, the limit of the sequence $\{\varphi_n(tt_1, tt_2)\}$, namely $\varphi(tt_1, tt_2)$, $|tt_i| < \alpha_i$, $(i = 1, 2)$, is also a m.g.f. Since $\varphi(tt_1, tt_2) = \varphi^*(tt_1, tt_2)$ in a sufficiently small interval containing the point t = 0, it follows from the uniqueness property of the m.g.f. in the univariate case that $\varphi(tt_1, tt_2) = \varphi^*(tt_1, tt_2)$ identically for $|tt_i| < \alpha_i$, $(i = 1, 2)$. Putting t = 1, we find $\varphi(t_1, t_2) = \varphi^*(t_1, t_2)$, $|t_i| < \alpha_i$, $(i = 1, 2)$. Thus (c) is completely proved.

To prove (d), it is sufficient to notice that the sequence $\{\varphi_n(t_1, t_2)\}$ is uniformly continuous in each two-dimensional interval $|t_i| \le \beta_i < \alpha_i$, $(i = 1, 2)$, (that is, for any $\epsilon > 0$, there exists a positive number $\delta = \delta(\epsilon)$ such that

$|\varphi_n(t_1', t_2') - \varphi_n(t_1, t_2)| < \epsilon$

if $|t_i' - t_i| < \delta$, $|t_i'| \le \beta_i$, $|t_i| \le \beta_i$, $(i = 1, 2)$, $(n = 1, 2, \cdots)$).

Consequently, the sequence $\{\varphi_n(t_1, t_2)\}$ which is convergent for $|t_i| \le \beta_i$, must be uniformly convergent if $|t_i| \le \beta_i$, $(i = 1, 2)$.


REFERENCES 


[1] J. H. Curtiss, "A note on the theory of moment generating functions", Annals of
Math. Stat., Vol. 13 (1942).

[2] D. V. Widder, The Laplace Transform, Princeton Univ. Press, 1941.

[3] H. Cramér, Random Variables and Probability Distributions, Cambridge Tract No.
36, 1937.













A GENERALIZATION OF TSHEBYSHEV’S INEQUALITY TO TWO 
DIMENSIONS 


By Z. W. Brrnspaum, J. RayMonp, anp H. 8S. ZucKERMAN 
University of Washington 
1. Let $X_1, X_2, \cdots, X_n$ be independent random variables with expectations
$E(X_j) = e_j$ and variances $\sigma^2(X_j) = \sigma_j^2$ for $j = 1, 2, \cdots, n$. The question
may be asked: What is the upper bound for the probability

$P\left(\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right)$

that the point $(X_1, X_2, \cdots, X_n)$ does not fall inside of the ellipsoid

$\sum_{j=1}^{n} \frac{(x_j - e_j)^2}{t_j^2} = 1$?

For $n = 1$ the answer to this question is given by Tshebyshev's inequality

(1.1)    $P\left(\frac{(X - e)^2}{t^2} \ge 1\right) \le \frac{\sigma^2}{t^2}$

which can not be improved without further assumptions. By a trivial generalization
of the argument leading to (1.1) one can prove the inequality

(1.2)    $P\left(\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right) \le \sum_{j=1}^{n} \frac{\sigma_j^2}{t_j^2}$

for any integer $n$. This inequality, however, can be improved for $n \ge 2$. In
particular, for $n = 2$, the following theorem will be proved:
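THEOREM 1.1. Let X and Y be independent random variables, with expectations
$E(X) = X_0$, $E(Y) = Y_0$ and variances $\sigma_X^2$, $\sigma_Y^2$. Then, for any $s > 0$, $t > 0$
such that $\frac{\sigma_X^2}{s^2} \le \frac{\sigma_Y^2}{t^2}$ we have

(1.3)    $P\left(\frac{(X - X_0)^2}{s^2} + \frac{(Y - Y_0)^2}{t^2} \ge 1\right) \le L(s, t)$

where

$L(s, t) = 1$    if $\frac{\sigma_X^2}{s^2} + \frac{\sigma_Y^2}{t^2} \ge 1$,

$L(s, t) = \frac{s^2 \sigma_Y^2}{t^2 (s^2 - \sigma_X^2)}$    if $\frac{\sigma_X^2}{s^2} + \frac{\sigma_Y^2}{t^2} \le 1 \le \frac{1}{2}\left(\frac{\sigma_X^2}{s^2} + \frac{2\sigma_Y^2}{t^2} + \sqrt{\frac{\sigma_X^4}{s^4} + \frac{4\sigma_Y^4}{t^4}}\right)$,

$L(s, t) = \frac{\sigma_X^2}{s^2} + \frac{\sigma_Y^2}{t^2} - \frac{\sigma_X^2 \sigma_Y^2}{s^2 t^2}$    if $\frac{1}{2}\left(\frac{\sigma_X^2}{s^2} + \frac{2\sigma_Y^2}{t^2} + \sqrt{\frac{\sigma_X^4}{s^4} + \frac{4\sigma_Y^4}{t^4}}\right) \le 1$.

For any given $\sigma_X$, $\sigma_Y$, $s > 0$, $t > 0$ such that $\frac{\sigma_X^2}{s^2} \le \frac{\sigma_Y^2}{t^2}$ there exist independent random
variables X and Y with the variances $\sigma_X^2$, $\sigma_Y^2$, such that the equality sign is true in
(1.3).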

This theorem is a special case of the more general statement: 

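THEOREM 1.2. Let W, Z be independent random variables such that

(1.4)    $P(W < 0) = P(Z < 0) = 0$,
(1.5)    $E(W) = \lambda$, $E(Z) = \mu$,
(1.6)    $\lambda \le \mu$.

Then, for any $t > 0$, we have

(1.7)    $P(W + Z \ge t) \le M(t)$

where

(1.8)    $M(t) = 1$    if $t \le \lambda + \mu$,

         $M(t) = \frac{\mu}{t - \lambda}$    if $\lambda + \mu \le t \le \frac{1}{2}(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2})$,

         $M(t) = \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}$    if $\frac{1}{2}(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}) \le t$.

For any given $\lambda > 0$, $\mu > 0$, $\lambda \le \mu$, and $t > 0$, there exist independent variables
W, Z such that (1.4) and (1.5) are fulfilled and that the equality sign is true in (1.7).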
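Theorem 1.1 is obtained from Theorem 1.2 by writing

$W = \frac{(X - X_0)^2}{s^2}, \quad Z = \frac{(Y - Y_0)^2}{t^2}$,

so that $\lambda = \sigma_X^2/s^2$ and $\mu = \sigma_Y^2/t^2$, and by evaluating the bound of (1.7) at the value 1 of its argument.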
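The bounds of Theorems 1.1 and 1.2 can be evaluated numerically with a short sketch like the following. The function names `M`, `L` and `chebyshev_sum` are illustrative choices, not notation from the paper beyond the symbols $M(t)$ and $L(s, t)$; the swap of the two means when $\lambda > \mu$ is an assumption that relies on the symmetric roles of W and Z.

```python
import math

def M(t, lam, mu):
    """Bound M(t) of (1.8) for P(W + Z >= t), with means lam and mu."""
    if lam > mu:
        lam, mu = mu, lam  # assumption: relabel so that lam <= mu, as in (1.6)
    if t <= lam + mu:
        return 1.0
    t_star = 0.5 * (lam + 2 * mu + math.sqrt(lam**2 + 4 * mu**2))
    if t <= t_star:
        return mu / (t - lam)
    return (lam + mu) / t - lam * mu / t**2

def L(s, t, var_x, var_y):
    """Bound L(s, t) of Theorem 1.1, obtained from M at the argument 1."""
    return M(1.0, var_x / s**2, var_y / t**2)

def chebyshev_sum(s, t, var_x, var_y):
    """Bound (1.2) for n = 2, capped at 1."""
    return min(1.0, var_x / s**2 + var_y / t**2)

print(L(3, 3, 1, 1), chebyshev_sum(3, 3, 1, 1))
```

For unit variances and $s = t = 3$ the sketch compares the improved bound with the sum bound (1.2).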


2. Before proving Theorem 1.2 we shall derive two lemmas. The first of these 
lemmas deals with more than one variable. Since its proof for general m does not 
present any additional difficulties it will be stated and proven for any number 
$m \ge 1$ of variables, although in the proof of Theorem 1.2 it will be used only
for $m = 1$.

Lemma 1. Let $U, V_1, V_2, \cdots, V_m$ be independent discrete random variables
with only non-negative possible values, and let U have a probability distribution
with the possible values $0 \le U_1 < U_2 < \cdots < U_n$ and the probabilities $P(U_i) = r_i$
for $i = 1, 2, \cdots, n$. We consider any three possible values $U_j$, $U_k$, $U_l$ of U such
that

$0 \le U_j < U_k < U_l$,

with the corresponding probabilities $r_j$, $r_k$, $r_l$. Then, for any $t > 0$, there exists a
random variable $U'$ with the same distribution as U except that the probabilities
$r_j$, $r_k$, $r_l$ of $U_j$, $U_k$, $U_l$ are replaced by $r_j'$, $r_k'$, $r_l'$ such that

(2.1)    $E(U') = E(U)$
(2.2)    one of $r_j'$, $r_k'$, $r_l'$ is zero
(2.3)    $P(U' + V_1 + \cdots + V_m \ge t) \ge P(U + V_1 + \cdots + V_m \ge t)$.
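Proof: let $r_j'$, $r_k'$, $r_l'$ be written

(2.4)    $r_j' = r_j + \alpha\beta, \quad r_k' = r_k - \beta, \quad r_l' = r_l + (1 - \alpha)\beta$.

For any $\alpha$, $\beta$ we then have

$r_j' + r_k' + r_l' = r_j + r_k + r_l$.

Choosing

(2.5)    $\alpha = (U_l - U_k)/(U_l - U_j)$

we obtain the equality

$U_j r_j' + U_k r_k' + U_l r_l' = U_j r_j + U_k r_k + U_l r_l$

so that (2.1) is true for any $\beta$.
We obviously have

(2.6)    $P\left(U + \sum_{s=1}^{m} V_s \ge t\right) = \sum_{i=1}^{n} P(U = U_i) \cdot P\left(\sum_{s=1}^{m} V_s \ge t - U_i\right) = \sum_{i=1}^{n} r_i P\left(\sum_{s=1}^{m} V_s \ge t - U_i\right)$.

The variable $U'$ has the same possible values $U_i$ as the variable U. Writing
$P(U' = U_i) = r_i'$, for $i = 1, 2, \cdots, n$, we also have

(2.7)    $P\left(U' + \sum_{s=1}^{m} V_s \ge t\right) = \sum_{i=1}^{n} r_i' P\left(\sum_{s=1}^{m} V_s \ge t - U_i\right)$.

From (2.6), (2.7), and (2.4) we obtain

(2.8)    $P\left(U' + \sum_{s=1}^{m} V_s \ge t\right) - P\left(U + \sum_{s=1}^{m} V_s \ge t\right) = \alpha\beta P\left(\sum_{s=1}^{m} V_s \ge t - U_j\right) - \beta P\left(\sum_{s=1}^{m} V_s \ge t - U_k\right) + (1 - \alpha)\beta P\left(\sum_{s=1}^{m} V_s \ge t - U_l\right)$.

For $\alpha$ determined by (2.5), the right-hand side of (2.8) is of the form $C\beta$, and
will be positive if sign $\beta$ = sign $C$. If sign $C$ is positive, we choose $\beta = r_k$, and
have, from (2.4), $r_k' = 0$, and, from (2.8), the inequality (2.3). If sign $C$ is
negative, we set $\beta = \mathrm{Max}\left(-\frac{r_j}{\alpha}, -\frac{r_l}{1 - \alpha}\right)$ which leads to either $r_j' = 0$ or $r_l' = 0$,
and again to (2.3). In both cases we have kept the probabilities $r_j'$, $r_k'$, $r_l'$
non-negative as they should be.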
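The mass-shifting step of Lemma 1 can be traced on a small example. The following is a minimal sketch for $m = 1$, using exact rational arithmetic; the helper names `tail` and `lemma1_step` and the dictionary representation of a distribution are assumptions made for illustration only.

```python
from fractions import Fraction as Fr

def tail(dist_u, dist_v, t):
    """P(U + V >= t) for independent discrete U, V given as {value: prob}."""
    return sum(pu * pv for u, pu in dist_u.items()
               for v, pv in dist_v.items() if u + v >= t)

def lemma1_step(dist_u, dist_v, j, k, l, t):
    """Shift mass among the support points j < k < l of U as in (2.4), (2.5)."""
    alpha = Fr(l - k, l - j)                      # choice (2.5)

    def pv(x):                                    # P(V >= x)
        return sum(p for v, p in dist_v.items() if v >= x)

    # Coefficient C of beta on the right-hand side of (2.8).
    c = alpha * pv(t - j) - pv(t - k) + (1 - alpha) * pv(t - l)
    if c >= 0:
        beta = dist_u[k]                          # empties the middle point
    else:
        beta = max(-dist_u[j] / alpha, -dist_u[l] / (1 - alpha))
    new = dict(dist_u)
    new[j] += alpha * beta
    new[k] -= beta
    new[l] += (1 - alpha) * beta
    return {u: p for u, p in new.items() if p != 0}

U = {0: Fr(1, 4), 1: Fr(1, 2), 3: Fr(1, 4)}
V = {0: Fr(1, 2), 2: Fr(1, 2)}
U_new = lemma1_step(U, V, 0, 1, 3, 3)
print(U_new, tail(U, V, 3), tail(U_new, V, 3))
```

In this example the coefficient $C$ is negative, so mass is moved onto the middle point until one outer point is emptied; the mean of U is unchanged and the tail probability does not decrease.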







Lemma 2. Let the discrete random variable U have only the two non-negative
values $U_1 < U_2$, with the corresponding probabilities $r_1$, $r_2$, and let t be a given
number such that

(2.9)    $E(U) < t < U_2$.

Then there exists a number $a > 0$ such that the random variable $U'$ with the possible
values

(2.91)    $U_1' = U_1 + a$, $\quad U_2' = t$

and the corresponding probabilities $r_1$, $r_2$, has the properties

(2.92)    $0 \le U_1' < U_2'$
(2.93)    $E(U') = E(U)$.

Proof: to have (2.91) and (2.93) it is sufficient to choose

$a = \frac{r_2 (U_2 - t)}{r_1}$.

Then (2.92) is also fulfilled since, in view of (2.9), we have

$U_1' = U_1 + \frac{r_2 U_2 - r_2 t}{r_1} = \frac{E(U) - r_2 t}{r_1} < \frac{t - r_2 t}{r_1} = t = U_2'$

and obviously $a > 0$ and hence $U_1' > U_1 \ge 0$.
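The choice of $a$ in Lemma 2 can be checked on a concrete two-point distribution. This is a small sketch under the lemma's hypothesis (2.9); the function name `lemma2_shift` is hypothetical.

```python
from fractions import Fraction as Fr

def lemma2_shift(u1, u2, r1, r2, t):
    """Return the new support (U1', U2') of Lemma 2; probabilities stay r1, r2."""
    mean = r1 * u1 + r2 * u2
    if not (mean < t < u2):
        raise ValueError("Lemma 2 requires E(U) < t < U2")
    a = r2 * (u2 - t) / r1          # the shift chosen in the proof
    return u1 + a, t

r1, r2 = Fr(3, 4), Fr(1, 4)
new_u1, new_u2 = lemma2_shift(0, 10, r1, r2, 4)
print(new_u1, new_u2, r1 * new_u1 + r2 * new_u2)
```

The upper value is pulled down to $t$, the lower value is raised by $a$, and the expectation is unchanged.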


3. Theorem 1.2 will first be proven under the assumption that W and Z are
discrete random variables, each with a finite number of non-negative possible
values. By repeatedly applying Lemma 1 with $m = 1$, $U = W$, $V_1 = Z$, we
reduce the number of possible values of W which have non-zero probabilities
to two, and denote those possible values by $W_1 < W_2$, and their probabilities
by $p_1$ and $p_2 = 1 - p_1$. Then, applying Lemma 1 to the case $m = 1$, $U = Z$,
$V_1 = W$, we similarly reduce the possible values of Z to the two non-negative
values $Z_1 < Z_2$, and denote the corresponding probabilities by $q_1$ and $q_2 = 1 - q_1$.
Throughout all these steps the expectations $E(W) = \lambda$ and $E(Z) = \mu$ remain
unchanged, and $P(W + Z \ge t)$ is not decreased.

For $t \le \lambda + \mu$, inequality (1.7) is obviously true, and equality is attained for
W having the only possible value $\lambda$ with probability 1 and Z having the only
possible value $\mu$ with probability 1.

For the remainder of the proof we assume $t > \lambda + \mu$. We then have

$t > \lambda + \mu \ge \lambda + Z_1 \ge W_1 + Z_1$.

If $W_2 > t$, we may replace it by $W_2' = t$ according to Lemma 2. Similarly, if
$Z_2 > t$, we may replace it by $Z_2' = t$. The probability $P(W + Z \ge t)$ is not
decreased in this process. We may thus assume, without loss of generality, that

$W_2 \le t, \quad Z_2 \le t$.







The joint distribution of (W, Z) has now the possible values represented by the
four points $(W_1, Z_1)$, $(W_1, Z_2)$, $(W_2, Z_1)$, $(W_2, Z_2)$. The coordinates of these
four points and their probabilities fulfill the following conditions

(3.1)    $0 \le W_1 \le \lambda \le W_2 \le t$; $\quad 0 \le Z_1 \le \mu \le Z_2 \le t$
(3.2)    $p_1 + p_2 = q_1 + q_2 = 1$
(3.3)    $p_1 W_1 + p_2 W_2 = \lambda$, $\quad q_1 Z_1 + q_2 Z_2 = \mu$.

In view of (3.1), the point $(W_1, Z_1)$ always lies below the line $W + Z = t$. The
other points may or may not lie below that line. Accordingly, we distinguish
the cases listed in Table I. These clearly include all possible cases since $(W_2, Z_2)$
can not be below the line $W + Z = t$ without all the other points being below
that line.

TABLE I

Case    Points below line W + Z = t                              Points not below line W + Z = t
I       $(W_1, Z_1)$                                             $(W_2, Z_1)$, $(W_1, Z_2)$, $(W_2, Z_2)$
II      $(W_1, Z_1)$, $(W_2, Z_1)$                               $(W_1, Z_2)$, $(W_2, Z_2)$
III     $(W_1, Z_1)$, $(W_1, Z_2)$                               $(W_2, Z_1)$, $(W_2, Z_2)$
IV      $(W_1, Z_1)$, $(W_2, Z_1)$, $(W_1, Z_2)$                 $(W_2, Z_2)$
V       $(W_1, Z_1)$, $(W_2, Z_1)$, $(W_1, Z_2)$, $(W_2, Z_2)$   none

In case V we have $P(W + Z \ge t) = 0$.

For the discussion of the remaining cases we note the following relationships
which follow from (3.2) and (3.3).

$p_1 = \frac{W_2 - \lambda}{W_2 - W_1}, \quad p_2 = \frac{\lambda - W_1}{W_2 - W_1}, \quad q_1 = \frac{Z_2 - \mu}{Z_2 - Z_1}, \quad q_2 = \frac{\mu - Z_1}{Z_2 - Z_1}$.







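In case I we have

(3.41)    $W_1 + Z_1 < t, \quad W_2 + Z_1 \ge t, \quad W_1 + Z_2 \ge t, \quad W_2 + Z_2 \ge t$,

$P = P(W + Z \ge t) = p_1 q_2 + p_2 q_1 + p_2 q_2 = 1 - p_1 q_1 = 1 - \frac{(W_2 - \lambda)(Z_2 - \mu)}{(W_2 - W_1)(Z_2 - Z_1)}$.

Since P is a decreasing function of $W_1$ and $Z_1$, we replace $W_1$ and $Z_1$ by the
smallest values compatible with (3.41), namely $W_1 = t - Z_2$, $Z_1 = t - W_2$,
and obtain

$P \le 1 - \frac{(W_2 - \lambda)(Z_2 - \mu)}{(W_2 + Z_2 - t)^2} = R(W_2, Z_2)$.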







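For fixed $Z_2$, $R(W_2, Z_2)$ has a minimum at $W_2 = Z_2 + 2\lambda - t$ and no other
extremum, hence it assumes its maximum at one or both of the end-points of the
interval for $W_2$ which, by (3.1) and (3.41), is

$t - Z_1 \le W_2 \le t$.

In view of (3.1) we also have $t - \mu \le t - Z_1$, and hence

$P \le \mathrm{Max}\,[R(t - \mu, Z_2), R(t, Z_2)]$.

We find

$R(t - \mu, Z_2) = 1 - \frac{t - \mu - \lambda}{Z_2 - \mu} \le 1 - \frac{t - \mu - \lambda}{t - \mu} = \frac{\lambda}{t - \mu}$

and

$R(t, Z_2) = 1 - \frac{(t - \lambda)(Z_2 - \mu)}{Z_2^2} = R^{(1)}(Z_2)$.

This last expression has a minimum for $Z_2 = 2\mu$ and no other extremum, hence it
assumes its maximum at the ends of the interval for $Z_2$ which, by (3.41) and
(3.1), is

$t - W_1 \le Z_2 \le t$.

From (3.1) we also have $t - \lambda \le t - W_1$; and thus

$R(t, Z_2) \le \mathrm{Max}\,[R^{(1)}(t - \lambda), R^{(1)}(t)] = \mathrm{Max}\left[\frac{\mu}{t - \lambda}, \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right]$.

Finally, we obtain

$P \le \mathrm{Max}\left[\frac{\lambda}{t - \mu}, \frac{\mu}{t - \lambda}, \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right]$.

Each of the values $P = \frac{\lambda}{t - \mu}$, $\frac{\mu}{t - \lambda}$, $\frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}$ can be attained in case I,
as is shown by the probability distributions

(3.42)    $W_1 = 0$, $W_2 = t - \mu$, $Z_1 = \mu$, $Z_2 = t$,
          $p_1 = 1 - \frac{\lambda}{t - \mu}$, $p_2 = \frac{\lambda}{t - \mu}$, $q_1 = 1$, $q_2 = 0$;

(3.43)    $W_1 = \lambda$, $W_2 = t$, $Z_1 = 0$, $Z_2 = t - \lambda$,
          $p_1 = 1$, $p_2 = 0$, $q_1 = 1 - \frac{\mu}{t - \lambda}$, $q_2 = \frac{\mu}{t - \lambda}$;

(3.44)    $W_1 = 0$, $W_2 = t$, $Z_1 = 0$, $Z_2 = t$,
          $p_1 = 1 - \frac{\lambda}{t}$, $p_2 = \frac{\lambda}{t}$, $q_1 = 1 - \frac{\mu}{t}$, $q_2 = \frac{\mu}{t}$.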

















In case II we have

(3.51)    $W_1 + Z_1 < t, \quad W_2 + Z_1 < t, \quad W_1 + Z_2 \ge t, \quad W_2 + Z_2 \ge t$,

$P = P(W + Z \ge t) = p_1 q_2 + p_2 q_2 = q_2 = \frac{\mu - Z_1}{Z_2 - Z_1}$.

This is a decreasing function of $Z_1$ as well as of $Z_2$ and hence takes its maximum
for the smallest values of $Z_1$ and $Z_2$ compatible with (3.1) and (3.51), that is for
$Z_1 = 0$, $Z_2 = t - \lambda$. We thus obtain

$P \le \frac{\mu}{t - \lambda}$.

This upper bound can be attained in case II, as may be seen from the distribution

(3.52)    $W_1 = \lambda$, $W_2 = \lambda$, $Z_1 = 0$, $Z_2 = t - \lambda$,
          $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{2}$, $q_1 = 1 - \frac{\mu}{t - \lambda}$, $q_2 = \frac{\mu}{t - \lambda}$.

Case III is symmetrical with case II and leads to the inequality

$P \le \frac{\lambda}{t - \mu}$.
In case IV we have

(3.61)    $W_1 + Z_1 < t, \quad W_2 + Z_1 < t, \quad W_1 + Z_2 < t, \quad W_2 + Z_2 \ge t$,

$P = P(W + Z \ge t) = p_2 q_2 = \frac{(\lambda - W_1)(\mu - Z_1)}{(W_2 - W_1)(Z_2 - Z_1)}$.

The right hand side is a decreasing function of each of the variables $W_1$, $W_2$,
$Z_1$, $Z_2$, and hence is increased by choosing for these variables the smallest values
compatible with (3.61), i.e.

(3.62)    $W_1 = Z_1 = 0, \quad W_2 + Z_2 = t$

for which we obtain

$P \le \frac{\lambda\mu}{W_2 (t - W_2)} = R^{(2)}(W_2)$.

Since $R^{(2)}(W_2)$ has a minimum at $W_2 = \frac{t}{2}$ and no other extremum, it attains its
largest value at one of the end points of the interval for $W_2$ which, by (3.1),
(3.61) and (3.62), is

$\lambda \le W_2 \le t - \mu$.

This leads to

$P \le \mathrm{Max}\,[R^{(2)}(\lambda), R^{(2)}(t - \mu)] = \mathrm{Max}\left[\frac{\mu}{t - \lambda}, \frac{\lambda}{t - \mu}\right]$.

The upper bounds $\frac{\mu}{t - \lambda}$, $\frac{\lambda}{t - \mu}$, respectively, are attained in case IV for the
probability distributions

(3.63)    $W_1 = 0$, $W_2 = \lambda$, $Z_1 = 0$, $Z_2 = t - \lambda$,
          $p_1 = 0$, $p_2 = 1$, $q_1 = 1 - \frac{\mu}{t - \lambda}$, $q_2 = \frac{\mu}{t - \lambda}$

and

$W_1 = 0$, $W_2 = t - \mu$, $Z_1 = 0$, $Z_2 = \mu$,
$p_1 = 1 - \frac{\lambda}{t - \mu}$, $p_2 = \frac{\lambda}{t - \mu}$, $q_1 = 0$, $q_2 = 1$.


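From the preceding discussion we conclude that $P = P(W + Z \ge t)$ always
fulfills the inequality

$P \le \mathrm{Max}\left[\frac{\lambda}{t - \mu}, \frac{\mu}{t - \lambda}, \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right] = U(t)$

for $t > \lambda + \mu$. Since we have assumed $\lambda \le \mu$, we have $\frac{\lambda}{t - \mu} \le \frac{\mu}{t - \lambda}$ for
$t > \lambda + \mu$, and therefore

$U(t) = \mathrm{Max}\left[\frac{\mu}{t - \lambda}, \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}\right]$ for $t > \lambda + \mu$.

It is easily verified that

$\frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2} \le \frac{\mu}{t - \lambda}$ for $\lambda + \mu \le t \le \frac{1}{2}(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2})$

and

$\frac{\mu}{t - \lambda} \le \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}$ for $\frac{1}{2}(\lambda + 2\mu + \sqrt{\lambda^2 + 4\mu^2}) \le t$

so that we have $U(t) = M(t)$ as defined in (1.8). For given $\lambda$, $\mu$ and any $t > \lambda + \mu$,
the equality $P = \frac{\mu}{t - \lambda}$ is fulfilled for the distributions (3.43), (3.52) and
(3.63), while the equality $P = \frac{\lambda + \mu}{t} - \frac{\lambda\mu}{t^2}$ is true for the distribution (3.44).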
This completes the proof of Theorem 1.2 for discrete random variables. If 
W and Z are independent random variables with the cumulative probability 
functions P(W < w) = F(w) and P(Z < z) = G(z), then each of these cumulative 
probability functions can be uniformly approximated by a step function with a 
finite number of steps, that is by the cumulative probability function of a discrete 
random variable with a finite number of possible values. Since for such variables 
Theorem 1.2 is proven, it also is true for the general random variables W and Z. 
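The two claims that close the discrete proof, that the maximum of the three case bounds equals $M(t)$ and that the bound is attained, can be checked numerically. The sketch below assumes $\lambda \le \mu$ and $t > \lambda + \mu$; the helper names are illustrative, and the two-point distributions are those of (3.43) and (3.44).

```python
import math
from fractions import Fraction as Fr

def M(t, lam, mu):
    """Bound (1.8), assuming lam <= mu."""
    if t <= lam + mu:
        return 1.0
    t_star = 0.5 * (lam + 2 * mu + math.sqrt(lam**2 + 4 * mu**2))
    return mu / (t - lam) if t <= t_star else (lam + mu) / t - lam * mu / t**2

def U(t, lam, mu):
    """Maximum of the three case bounds, valid for t > lam + mu."""
    return max(lam / (t - mu), mu / (t - lam), (lam + mu) / t - lam * mu / t**2)

def tail(w, z, t):
    """P(W + Z >= t) for independent discrete W, Z given as {value: prob}."""
    return sum(pw * pz for a, pw in w.items() for b, pz in z.items() if a + b >= t)

def extremal(lam, mu, t):
    """Distributions (3.43) and (3.44) as pairs (W, Z) of {value: prob} dicts."""
    d343 = ({lam: Fr(1)}, {0: 1 - mu / (t - lam), t - lam: mu / (t - lam)})
    d344 = ({0: 1 - lam / t, t: lam / t}, {0: 1 - mu / t, t: mu / t})
    return d343, d344

lam, mu, t = Fr(1, 2), Fr(3, 5), Fr(3)
d343, d344 = extremal(lam, mu, t)
print(tail(*d343, t), tail(*d344, t))
```

With exact fractions the two tail probabilities reproduce $\mu/(t - \lambda)$ and $(\lambda + \mu)/t - \lambda\mu/t^2$ without rounding.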













4. An attempt to extend the method used in proving Theorem 1.2 to more 
than two variables leads to arguments of a prohibitive length. It is possible, 
however, to obtain corollaries of Theorems 1.1 and 1.2 which lead to an improve- 
ment of inequality (1.2) for n variables. 

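Corollary 2.1. Let $X_1, X_2, \cdots, X_n$ be independent random variables with
expectations $E(X_j) = e_j$ and variances $\sigma^2(X_j) = \sigma_j^2$. Then, for any $t_j > 0$,
$j = 1, 2, \cdots, n$, and any m such that

$\sum_{j=1}^{m} \frac{\sigma_j^2}{t_j^2} = \Sigma_1 \le \sum_{j=m+1}^{n} \frac{\sigma_j^2}{t_j^2} = \Sigma_2$,

we have the inequality

$P\left(\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right) \le 1$    if $\Sigma_1 + \Sigma_2 \ge 1$,

$P\left(\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right) \le \frac{\Sigma_2}{1 - \Sigma_1}$    if $\Sigma_1 + \Sigma_2 \le 1 \le \frac{1}{2}\left[\Sigma_1 + 2\Sigma_2 + \sqrt{\Sigma_1^2 + 4\Sigma_2^2}\right]$,

$P\left(\sum_{j=1}^{n} \frac{(X_j - e_j)^2}{t_j^2} \ge 1\right) \le \Sigma_1 + \Sigma_2 - \Sigma_1\Sigma_2$    if $\frac{1}{2}\left[\Sigma_1 + 2\Sigma_2 + \sqrt{\Sigma_1^2 + 4\Sigma_2^2}\right] \le 1$.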

This corollary is a special case of the following corollary to Theorem 1.2.

Corollary 2.2. Let $W_1, W_2, \cdots, W_n$ be independent random variables
such that $P(W_j < 0) = 0$ for $j = 1, 2, \cdots, n$, and let m be any integer such that

$\sum_{j=1}^{m} E(W_j) = \lambda, \quad \sum_{j=m+1}^{n} E(W_j) = \mu, \quad \lambda \le \mu$.

Then, for any $t > 0$, we have

$P\left(\sum_{j=1}^{n} W_j \ge t\right) \le M(t)$

where $M(t)$ is defined by (1.8).
This corollary follows immediately from Theorem 1.2 by writing

$W = \sum_{j=1}^{m} W_j, \quad Z = \sum_{j=m+1}^{n} W_j$.

To obtain Corollary 2.1, one only has to write in Corollary 2.2

$W_j = \frac{(X_j - e_j)^2}{t_j^2}$.


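If some additional assumptions are made on the expectations $E(W_j)$ or on the
variances $\sigma_j^2$, the upper bounds in Corollaries 2.1 and 2.2 may be minimized by
proper choice of m or of the $t_j$. For example, if all the variances are equal

$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_n^2 = \sigma^2$

and n is even, one obtains the inequality

$P\left(\sum_{j=1}^{n} (X_j - e_j)^2 \ge t^2\right) \le 1$    if $t^2 \le n\sigma^2$,

$P\left(\sum_{j=1}^{n} (X_j - e_j)^2 \ge t^2\right) \le \frac{n\sigma^2}{2t^2 - n\sigma^2}$    if $n\sigma^2 \le t^2 \le \frac{3 + \sqrt{5}}{4} n\sigma^2$,

$P\left(\sum_{j=1}^{n} (X_j - e_j)^2 \ge t^2\right) \le \frac{n\sigma^2}{t^2} - \frac{n^2\sigma^4}{4t^4}$    if $\frac{3 + \sqrt{5}}{4} n\sigma^2 \le t^2$.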
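For equal variances the bound of Corollary 2.1 can be compared with the sum bound (1.2). The sketch below assumes the split $m = n/2$ and a common $t_j = t$, so that $\Sigma_1 = \Sigma_2 = n\sigma^2/(2t^2)$; the function names are illustrative.

```python
import math

def corollary_bound(s1, s2):
    """Bound of Corollary 2.1 for partial sums s1 <= s2 of sigma_j^2 / t_j^2."""
    if s1 + s2 >= 1:
        return 1.0
    if 0.5 * (s1 + 2 * s2 + math.sqrt(s1**2 + 4 * s2**2)) >= 1:
        return s2 / (1 - s1)
    return s1 + s2 - s1 * s2

def equal_variance_bound(n, var, t):
    """Bound on P(sum (X_j - e_j)^2 >= t^2), n even, all variances equal to var."""
    s = n * var / (2 * t * t)       # assumption: m = n/2 and t_j = t for all j
    return corollary_bound(s, s)

for t in (2.2, 3.0, 4.0):
    print(t, equal_variance_bound(4, 1.0, t), min(1.0, 4 * 1.0 / t**2))
```

The second printed column is never larger than the third, which is the bound (1.2) capped at 1.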


j=l 














DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT
IN A CIRCULARLY CORRELATED UNIVERSE¹


By R. B. Leipnik


Cowles Commission for Research in Economics 


1. Summary. It is desired to find an approximate distribution of simple
form for the statistic $\bar r = \frac{x_1 x_2 + \cdots + x_T x_1}{x_1^2 + \cdots + x_T^2}$ ($\bar r$ is an estimate of the serial correlation
coefficient $\rho$ in a circular universe) in the case that $\rho \ne 0$ in the universe.
Such a distribution is obtained by smoothing the joint characteristic function
of the numerator and denominator of the expression for $\bar r$. The first two moments
are calculated; from these $\bar r$ is seen to be a consistent estimate of $\rho$. A
graph of this distribution for sample size $T = 20$ and various values of $\rho$ is given.

In addition, an approximate distribution for $p = x_1^2 + \cdots + x_T^2$ is derived
which reduces to the exact ($\chi^2$-) distribution if $\rho = 0$. From a formula which
yields all moments, it is concluded that, at least up to the degree of approximation
attained, $p/T$ is an unbiased and consistent estimate of $\sigma^2$.


2. Several writers have investigated the temporally homogeneous stochastic
process defined by

(1)    $x_t - \rho x_{t-1} = z_t, \quad t = 1, 2, \cdots, T, \quad |\rho| < 1$,

where the $z_t$ are unobservable disturbances, normally and independently distributed
with mean zero and variance $\sigma^2$, the $x_t$ are observed variates, and the
"first observation" $x_0$ has a normal distribution with mean zero and such a
variance $\sigma_0^2$ that all later observations have the same variance. Thus we have

(2)    $\sigma_0^2 = \frac{\sigma^2}{1 - \rho^2}$

and the joint distribution of a sample of $T + 1$ successive values is

(3)    $g(x_0, x_1, \cdots, x_T) = \frac{(1 - \rho^2)^{1/2}}{(2\pi\sigma^2)^{(T+1)/2}} \exp\left[-\frac{1}{2\sigma^2}\left\{x_0^2 + x_T^2 - 2\rho(x_0 x_1 + \cdots + x_{T-1} x_T) + (1 + \rho^2)(x_1^2 + \cdots + x_{T-1}^2)\right\}\right]$.

Koopmans ([1], formula 96), by smoothing characteristic values, has obtained
an approximation to the distribution of the serial correlation coefficient r for the
case $\rho = 0$, where

(4)    $r = \frac{x_0 x_1 + \cdots + x_{T-1} x_T}{x_0^2 + \cdots + x_T^2}$.

¹ Cowles Commission Papers, New Series, No. 21.













This result is expressed in the form of a definite integral whose evaluation
has not so far been effected.

By considering the related circular stochastic process, where $x_0$ is defined to
be the same observation as $x_T$, great simplification is obtained. Here the
joint distribution of $x_1, x_2, \cdots, x_T$ is

(5)    $f(x_1, x_2, \cdots, x_T) = \frac{\Lambda(\rho)}{(2\pi\sigma^2)^{T/2}} \exp\left[-\frac{1}{2\sigma^2(1 - \rho^2)}\left\{(1 + \rho^2)(x_1^2 + \cdots + x_T^2) - 2\rho(x_1 x_2 + \cdots + x_T x_1)\right\}\right]$,

       $\Lambda(\rho) = \frac{1 - \rho^T}{(1 - \rho^2)^{T/2}}$.

By smoothing characteristic values, Koopmans ([1], formula 92) found a definite
integral and Dixon ([2], 3.22) an explicit expression for an approximate distribution
of the circular serial correlation coefficient $\bar r$, for the case $\rho = 0$, where

(6)    $\bar r = \frac{x_1 x_2 + \cdots + x_T x_1}{x_1^2 + \cdots + x_T^2}$.

Dixon's distribution $R_0(\bar r)$ has the simple form

(7)    $R_0(\bar r) = \frac{\Gamma\left(\frac{T}{2} + 1\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} (1 - \bar r^2)^{(T-1)/2}$.

Rubin [3] proved these results to be equivalent. On the other hand, R. L.
Anderson [4] obtained the exact distribution of $\bar r$ in the case $\rho = 0$. Madow [5]
extended this result to the case $\rho \ne 0$, using a property of sufficient statistics
also noted by Koopmans ([1], p. 17) in connection with the non-circular problem.

It would, however, be difficult to find percentile points or moments from
Madow's exact distribution. An approximate distribution of $\bar r$ for $\rho \ne 0$,
together with its moments, analogous to Dixon-Koopmans' for $\rho = 0$, should
therefore be of interest. The purpose of this paper is to obtain such a distribution
from the circular universe (5). The statistic $\bar r$ is shown to be a consistent
estimate of $\rho$ within the limits imposed by the approximation. In addition, an
approximate distribution for $p = x_1^2 + \cdots + x_T^2$ in the case $\rho \ne 0$ (which
reduces to the exact chi-squared distribution when $\rho = 0$) is derived, together
with all of its moments.


3. We begin by asking about an approximate joint distribution of p and q defined by

(8)    $p = x_1^2 + \cdots + x_T^2$,
       $q = x_1 x_2 + \cdots + x_T x_1$.


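Defining $\varphi(u, v)$ as the expectation of $\exp[i(up + vq)]$, we have

(9)    $\varphi(u, v) = \frac{\Lambda(\rho)}{(2\pi\sigma^2)^{T/2}} \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \exp\left[-\frac{1}{2\sigma^2}\left\{\left(\frac{1 + \rho^2}{1 - \rho^2} - 2i\sigma^2 u\right) p - 2\left(\frac{\rho}{1 - \rho^2} + i\sigma^2 v\right) q\right\}\right] dx_1 \cdots dx_T$.

On integration, we find

(10)    $\varphi(u, v) = \Lambda(\rho)[A(u, v)]^{-1/2}$

where $A(u, v)$ is the determinant of the matrix associated with the quadratic
form within the curly brackets in (9). $A(u, v)$ is a circulant; its value as determined
from the circulant formula ([2], p. 123) is

(11)    $A(u, v) = \prod_{t=1}^{T} \left(y - 2z \cos \frac{2\pi t}{T}\right)$

where y and z are defined by

(12)    $y = \frac{1 + \rho^2}{1 - \rho^2} - 2i\sigma^2 u$,
        $z = \frac{\rho}{1 - \rho^2} + i\sigma^2 v$.

To get an approximation $\bar A(u, v)$ to $A(u, v)$ we smooth $\log A(u, v)$ by Koopmans'
method. We have

(13)    $\log A(u, v) = \sum_{t=1}^{T} \log\left(y - 2z \cos \frac{2\pi t}{T}\right)$.

We define $\bar A(u, v)$ through

(14)    $\log \bar A(u, v) = \int_0^T \log\left(y - 2z \cos \frac{2\pi t}{T}\right) dt$

in which the summation in (13) is replaced by integration. The integral in (14)
is easily evaluated ([6], p. 65) giving

(15)    $\bar A(u, v) = \left(\frac{y + \sqrt{y^2 - 4z^2}}{2}\right)^T$.

Incidentally, had we used $q_L = x_1 x_{1+L} + \cdots + x_T x_{T+L}$ in place of $q_1 = q$ in (9),
we would have obtained the same expression (15) for $\bar A(u, v)$.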
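The effect of replacing the sum (13) by the integral (14) can be seen numerically at $u = v = 0$, where y and z are real. This is a minimal sketch; the function names are illustrative, and the comparison value $(1 - \rho^T)^2$ follows from the standard factorization of the circulant product.

```python
import math

def exact_det(y, z, T):
    """Circulant determinant (11) for real y, z."""
    return math.prod(y - 2 * z * math.cos(2 * math.pi * t / T) for t in range(1, T + 1))

def smoothed_det(y, z, T):
    """Smoothed determinant (15) for real y, z with y^2 >= 4 z^2."""
    return ((y + math.sqrt(y * y - 4 * z * z)) / 2) ** T

rho, T = 0.5, 20
y0 = (1 + rho**2) / (1 - rho**2)   # y of (12) at u = 0
z0 = rho / (1 - rho**2)            # z of (12) at v = 0
ratio = exact_det(y0, z0, T) / smoothed_det(y0, z0, T)
print(ratio, (1 - rho**T) ** 2)
```

For $T = 20$ and $\rho = .5$ the ratio differs from 1 by about $2\rho^T$, which indicates how close the smoothed determinant is at the origin.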
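Setting $\bar\varphi(u, v) = \bar\lambda(\rho)[\bar A(u, v)]^{-1/2}$ we may determine $\bar\lambda(\rho)$ by the requirement
$\bar\varphi(0, 0) = 1$. A simple calculation yields the result $\bar\lambda(\rho) = (1 - \rho^2)^{-T/2}$. (Note
that $\frac{\Lambda(\rho)}{\bar\lambda(\rho)} = 1 - \rho^T$ is close to 1 for large values of T). Our result for $\bar\varphi(u, v)$
appears as

(16)    $\bar\varphi(u, v) = (1 - \rho^2)^{-T/2} \left(\frac{y + \sqrt{y^2 - 4z^2}}{2}\right)^{-T/2}$.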


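The approximate joint distribution of p and q may be written as the double
Fourier integral

(17)    $D(p, q) = \frac{1}{4\pi^2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \exp[-i(up + vq)] \, (1 - \rho^2)^{-T/2} \left(\frac{y + \sqrt{y^2 - 4z^2}}{2}\right)^{-T/2} du \, dv$

which we evaluate ([7], 576.3, 914.3) by changing integration variables from
u, v to y, z and integrating out y and z successively. We obtain finally

(18)    $D(p, q) = \frac{T[2\sigma^2(1 - \rho^2)]^{-T/2}}{2\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} \, p^{-(T/2)-1} (p^2 - q^2)^{(T-1)/2} \exp\left[-\frac{1}{2\sigma^2(1 - \rho^2)}\left\{(1 + \rho^2) p - 2\rho q\right\}\right]$.

Changing variables from p, $q = p\bar r$ to p, $\bar r$, we obtain for $F(p, \bar r)$, the approximate
joint distribution of p and $\bar r$, the expression

(19)    $F(p, \bar r) = \frac{T[2\sigma^2(1 - \rho^2)]^{-T/2}}{2\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} \, p^{(T/2)-1} (1 - \bar r^2)^{(T-1)/2} \exp\left[-\frac{p}{2\sigma^2(1 - \rho^2)}\left\{1 + \rho^2 - 2\rho\bar r\right\}\right]$.

We could also have derived (19), following Madow, by noting that for $\rho = 0$,
p and $\bar r$ are independently distributed, p having the chi-squared distribution and $\bar r$
having approximately the Dixon distribution (7), and that p and $\bar r$ are sufficient
statistics for the estimation of $\rho$ and $\sigma^2$.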


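4. The approximate marginal distribution $R_\rho(\bar r)$ of $\bar r$ is obtained by an easy
integration from (19)

(20)    $R_\rho(\bar r) = \int_0^\infty F(p, \bar r) \, dp = \frac{T[2\sigma^2(1 - \rho^2)]^{-T/2}}{2\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} (1 - \bar r^2)^{(T-1)/2} \int_0^\infty p^{T/2-1} \exp\left[-\frac{p}{2\sigma^2(1 - \rho^2)}\left\{1 + \rho^2 - 2\rho\bar r\right\}\right] dp$,

        $R_\rho(\bar r) = \frac{\Gamma\left(\frac{T}{2} + 1\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} (1 - \bar r^2)^{(T-1)/2} (1 + \rho^2 - 2\rho\bar r)^{-T/2}$.

Our notation is consistent since $R_\rho(\bar r)$ indeed reduces to the Dixon distribution
for $\rho = 0$. $R_\rho(\bar r)$ has a maximum when

$\bar r = \bar r_{\max} = \frac{1}{2\rho(T - 2)}\left[(1 + \rho^2)(T - 1) - \sqrt{(T - 1)^2 (1 - \rho^2)^2 + 4\rho^2}\right]$.

A little manipulation shows that $1 > |\bar r_{\max}| > |\rho|$ and that $\bar r_{\max} = \rho$ asymptotically.
A graph (Fig. 1) of $R_\rho(\bar r)$ for $T = 20$, $\rho = 0, .2, .5, .7, .9$ is appended
from which it is seen that for $|\rho|$ near 1, the distribution becomes highly concentrated
about $\bar r_{\max}$. On differentiating $R_\rho(\bar r)$ with respect to $\rho$ and eliminating,
the envelope of the $R_\rho(\bar r)$ is seen to be

$\frac{\Gamma\left(\frac{T}{2} + 1\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} (1 - \bar r^2)^{-1/2}$.

Fig. 1. Graph of the Distribution of the Serial Correlation Coefficient in a Circular
Universe, for T = 20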
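The density (20) and the location of its maximum can be tabulated with a few lines of code. This is a sketch under the formulas as written above; `leipnik_pdf`, `mode` and `simpson` are illustrative names, and the integration uses a plain Simpson rule rather than any library routine.

```python
import math

def leipnik_pdf(r, rho, T):
    """Approximate density R_rho(r) of (20) on [-1, 1]."""
    c = math.gamma(T / 2 + 1) / (math.gamma(0.5) * math.gamma(T / 2 + 0.5))
    base = max(0.0, 1 - r * r)      # guard against rounding just outside [-1, 1]
    return c * base ** ((T - 1) / 2) * (1 + rho * rho - 2 * rho * r) ** (-T / 2)

def mode(rho, T):
    """Location of the maximum of R_rho(r)."""
    if rho == 0:
        return 0.0
    root = math.sqrt((T - 1) ** 2 * (1 - rho**2) ** 2 + 4 * rho**2)
    return ((1 + rho**2) * (T - 1) - root) / (2 * rho * (T - 2))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

T = 20
for rho in (0.0, 0.2, 0.5, 0.7, 0.9):
    area = simpson(lambda r: leipnik_pdf(r, rho, T), -1.0, 1.0)
    print(rho, round(mode(rho, T), 4), round(area, 6))
```

The loop uses the same values of $\rho$ as Fig. 1 and reports the mode together with the total area under each curve.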


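5. Before evaluating the moments of $R_\rho(\bar r)$ we will pause to obtain the approximate
marginal distribution $P_\rho(p)$ of p, and its moments. We write

(21)    $P_\rho(p) = \int_{-1}^{+1} F(p, \bar r) \, d\bar r = \frac{T[2\sigma^2(1 - \rho^2)]^{-T/2}}{2\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} \, p^{T/2-1} \exp\left[-\frac{p}{2\sigma^2}\left(\frac{1 + \rho^2}{1 - \rho^2}\right)\right] \int_{-1}^{+1} (1 - \bar r^2)^{(T-1)/2} \exp\left[\frac{\rho p \bar r}{\sigma^2(1 - \rho^2)}\right] d\bar r$.

If we define $I_\nu(z)$, the Bessel function of order $\nu$ and purely imaginary
argument by

(22)    $I_\nu(z) = \sum_{n=0}^{\infty} \frac{(z/2)^{\nu + 2n}}{n! \, \Gamma(\nu + n + 1)}$,

we obtain ([8]) if $\rho \ne 0$

(23)    $P_\rho(p) = \frac{T}{2} \rho^{-T/2} p^{-1} \exp\left[-\frac{p}{2\sigma^2}\left(\frac{1 + \rho^2}{1 - \rho^2}\right)\right] I_{T/2}\left(\frac{p\rho}{\sigma^2(1 - \rho^2)}\right)$

and if $\rho = 0$

(24)    $P_0(p) = \frac{(2\sigma^2)^{-T/2} p^{T/2-1}}{\Gamma\left(\frac{T}{2}\right)} \exp\left[-\frac{p}{2\sigma^2}\right]$

on performing the integration indicated in (21). $P_0(p)$ coincides with the
exact distribution of p for $\rho = 0$. An expression covering all moments of $P_\rho(p)$ is
obtained from (16) by setting $v = 0$, differentiating, and setting $u = 0$. We have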


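(25)    $\bar\varphi(u, 0) = \bar\lambda(\rho) \left[\frac{y + \sqrt{y^2 - \frac{4\rho^2}{(1 - \rho^2)^2}}}{2}\right]^{-T/2}$

hence

(26)    $E[p^k] = i^{-k} \left.\frac{\partial^k}{\partial u^k} \bar\varphi(u, 0)\right|_{u=0} = (-2\sigma^2)^k (1 - \rho^2)^{-T/2} \left.\frac{d^k}{dy^k}\left[\frac{y + \sqrt{y^2 - \frac{4\rho^2}{(1 - \rho^2)^2}}}{2}\right]^{-T/2}\right|_{y = (1 + \rho^2)/(1 - \rho^2)}$.

From (26), we readily find

(27)    $E(p) = T\sigma^2, \quad E[p^2] = (T\sigma^2)^2 + 2T\sigma^4\left(\frac{1 + \rho^2}{1 - \rho^2}\right)$,

(28)    $\sigma_p^2 = 2T\sigma^4\left(\frac{1 + \rho^2}{1 - \rho^2}\right), \quad \sigma_{p/T}^2 = \frac{2\sigma^4}{T}\left(\frac{1 + \rho^2}{1 - \rho^2}\right)$.

Thus the unbiased character of $p/T$ as an estimate of $\sigma^2$ is reflected in the approximate
distribution, while (28), which shows that $\lim_{T \to \infty} \sigma_{p/T}^2 = 0$, indicates
that consistency is also reflected.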
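The first two moments in (27) can be checked by differentiating the function in (26) numerically. The following sketch uses central differences with a step `h` chosen for illustration; `g` and `moments_p` are hypothetical names.

```python
import math

def g(y, rho, T):
    """The bracketed function of (26), raised to the power -T/2."""
    c = 4 * rho**2 / (1 - rho**2) ** 2
    return ((y + math.sqrt(y * y - c)) / 2) ** (-T / 2)

def moments_p(rho, sigma2, T, h=1e-3):
    """E(p) and E(p^2) from (26) by central finite differences in y."""
    y0 = (1 + rho**2) / (1 - rho**2)
    lam_bar = (1 - rho**2) ** (-T / 2)
    d1 = (g(y0 + h, rho, T) - g(y0 - h, rho, T)) / (2 * h)
    d2 = (g(y0 + h, rho, T) - 2 * g(y0, rho, T) + g(y0 - h, rho, T)) / h**2
    return (-2 * sigma2) * lam_bar * d1, (-2 * sigma2) ** 2 * lam_bar * d2

m1, m2 = moments_p(0.5, 1.0, 20)
print(m1, m2)
```

For $T = 20$, $\rho = .5$, $\sigma^2 = 1$ the values can be compared with $T\sigma^2 = 20$ and $(T\sigma^2)^2 + 2T\sigma^4(1 + \rho^2)/(1 - \rho^2)$.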





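6. We now calculate the moments of $R_\rho(\bar r)$. Interchanging the order of integration
in the expression for $E[\bar r^k]$ is justified by the uniform convergence, so we have

(29)    $E[\bar r^k] = \int_{-1}^{+1} \bar r^k \left[\int_0^\infty F(p, \bar r) \, dp\right] d\bar r = \int_0^\infty \left[\int_{-1}^{+1} \bar r^k F(p, \bar r) \, d\bar r\right] dp$

        $= \frac{T[2\sigma^2(1 - \rho^2)]^{-T/2}}{2\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{T}{2} + \frac{1}{2}\right)} \int_0^\infty p^{T/2-1} \exp\left[-\frac{p}{2\sigma^2}\left(\frac{1 + \rho^2}{1 - \rho^2}\right)\right] \left\{\int_{-1}^{+1} \bar r^k (1 - \bar r^2)^{(T-1)/2} \exp(m\bar r) \, d\bar r\right\} dp$

where m is defined by

(30)    $m = \frac{\rho p}{\sigma^2(1 - \rho^2)}$.

Defining $G(m)$ by

(31)    $G(m) = \int_{-1}^{+1} (1 - \bar r^2)^{(T-1)/2} \exp(m\bar r) \, d\bar r$

we have ([8], p. 79)

(32)    $G(m) = \left(\frac{2}{m}\right)^{T/2} \Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{T + 1}{2}\right) I_{T/2}(m)$.

Differentiating each side of (32) k times, we find by (31) and (32)

(33)    $\frac{d^k}{dm^k} G(m) = \int_{-1}^{+1} \bar r^k (1 - \bar r^2)^{(T-1)/2} \exp(m\bar r) \, d\bar r = 2^{T/2} \Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{T + 1}{2}\right) \frac{d^k}{dm^k}\left[m^{-T/2} I_{T/2}(m)\right]$.

Using the identity ([8], p. 79)

$\frac{d}{dz}\left[z^{-\nu} I_\nu(z)\right] = z^{-\nu} I_{\nu+1}(z)$

and changing the integration variable in (29) from p to m, we obtain finally

(34)    $E[\bar r^k] = \frac{T}{2} \rho^{-T/2} \int_0^\infty m^{T/2-1} \exp\left(-\frac{1 + \rho^2}{2\rho} m\right) \frac{d^{k-1}}{dm^{k-1}}\left[m^{-T/2} I_{T/2+1}(m)\right] dm$.

For $k = 1$, we have ([8], p. 386)

(35)    $E(\bar r) = \frac{\rho}{1 + \frac{2}{T}}$.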
For $k = 2$, after some tedious calculation, we find

(36)    $E[\bar r^2] = \frac{1}{T + 2} + \rho^2 \frac{T(T + 1)}{(T + 2)(T + 4)}$,

        $\sigma_{\bar r}^2 = \frac{1}{T + 2}\left[1 - \frac{T(T - 2)}{(T + 2)(T + 4)} \rho^2\right]$.

We note that $\lim_{T \to \infty} E(\bar r) = \rho$ and $\lim_{T \to \infty} \sigma_{\bar r}^2 = 0$, so that at least to the extent of
approximation furnished by $R_\rho(\bar r)$, $\bar r$ is a consistent estimate of $\rho$.
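The closed forms (35) and (36) can be compared with direct numerical integration of the density $R_\rho(\bar r)$. This is a minimal sketch with illustrative function names and a plain Simpson rule.

```python
import math

def leipnik_pdf(r, rho, T):
    """Approximate density R_rho(r) of the circular serial correlation coefficient."""
    c = math.gamma(T / 2 + 1) / (math.gamma(0.5) * math.gamma(T / 2 + 0.5))
    base = max(0.0, 1 - r * r)      # guard against rounding just outside [-1, 1]
    return c * base ** ((T - 1) / 2) * (1 + rho * rho - 2 * rho * r) ** (-T / 2)

def moment(k, rho, T, n=4000):
    """k-th moment of R_rho by the composite Simpson rule on [-1, 1]."""
    h = 2.0 / n
    f = lambda r: r**k * leipnik_pdf(r, rho, T)
    total = f(-1.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return total * h / 3

rho, T = 0.5, 20
mean_closed = rho / (1 + 2 / T)                                   # (35)
second_closed = 1 / (T + 2) + rho**2 * T * (T + 1) / ((T + 2) * (T + 4))  # (36)
print(moment(1, rho, T), mean_closed)
print(moment(2, rho, T), second_closed)
```

The difference of the second moment and the squared mean gives the variance stated in (36).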


The author wishes to express his gratitude to Dr. T. Koopmans, under whose
kind direction this paper was written. 


REFERENCES 


[1] T. Koopmans, "Serial correlation and quadratic forms in normal variables", Annals of
Math. Stat., Vol. 13 (1942), pp. 14-33.

[2] W. J. Dixon, "Further contributions to the problem of serial correlation", Annals of
Math. Stat., Vol. 15 (1944), pp. 119-144.

[3] H. Rubin, "On the distribution of the serial correlation coefficient", Annals of Math.
Stat., Vol. 16 (1945), pp. 211-215.

[4] R. L. Anderson, "Distribution of the serial correlation coefficient", Annals of Math.
Stat., Vol. 13 (1942), pp. 1-13.

[5] W. G. Madow, "Note on the distribution of the serial correlation coefficient", Annals of
Math. Stat., Vol. 16 (1945), pp. 308-310.

[6] B. O. Peirce, A Short Table of Integrals, Ginn and Co., 1929.

[7] G. A. Campbell and R. M. Foster, Fourier Integrals for Practical Applications, Bell
Tel. Tech. Pub., 1942.

[8] G. N. Watson, A Treatise on the Theory of Bessel Functions, Second Rev. ed., Cambridge
University Press, 1944.


CONCERNING THE EFFECT OF INTRACLASS CORRELATION ON 
CERTAIN SIGNIFICANCE TESTS 


By John E. Walsh


Princeton University 


1. Summary. In practical applications it is frequently assumed that the 
values obtained by a sampling process are independently drawn from the same 
normal population. Then confidence intervals and significance tests which were 
derived under the assumption of independence are applied using these values. 
Often the assumption of independence between the values may be at best only 
approximately valid. For some cases, however, it may be permissible to assume 
that the correlation between each two values is the same (intraclass correlation). 
The purpose of this paper is to investigate the effect of this intraclass correlation 
on the confidence coefficients and significance levels of several well known 
confidence intervals and significance tests which were derived under the assump- 
tion of independence, and to extend these considerations to the case of two 
sets of values. 

In the first part of the paper the relations given in Table I are used to compute 
tables which show the effect of intraclass correlation on the confidence coefficients 
and significance levels of the confidence intervals and significance tests listed in 


Table II. The second part of the paper consists of the proofs of the relations 
given in Table I. 


2. Introduction. Let the n values $x_1, \cdots, x_n$ represent a single value of a
normal multivariate population for which each of the n variables has mean $\mu$,
variance $\sigma^2$, and the correlation between each two variables is $\rho$. These n
values will be called a correlated "sample." The values $x_1, \cdots, x_n$ and
$y_1, \cdots, y_m$ are said to represent two correlated "samples" if they have a normal
multivariate distribution such that the x's have mean $\mu$, variance $\sigma^2$, correlation
$\rho$, the y's have mean $\mu'$, variance $\sigma'^2$, correlation $\rho'$, and the correlation between
each x and y is $\rho''$. This paper shows that several well known quantities which
have Student t, $\chi^2$, or Snedecor F distributions when the values form random
samples still have these same distributions for correlated "samples" if the quantities
are multiplied by suitable constant factors, where it is to be remembered
that for normal populations a correlated "sample" is a random sample if and
only if $\rho = 0$ and that two correlated "samples" represent two random samples
if and only if $\rho = \rho' = \rho'' = 0$. The quantities considered and the corresponding
factors are listed in Table I, where $\bar x = \sum_1^n x_i / n$ and $\bar y = \sum_1^m y_j / m$. Several commonly
used confidence intervals and significance tests based on these quantities
and derived under the assumption of randomness are considered, and tables are
computed which show how the confidence coefficients and significance levels of
these confidence intervals and significance tests vary if the values are from
correlated "samples" instead of random samples. Table II contains an outline
of the confidence intervals and significance tests considered. It is found that
these confidence coefficients and significance levels can change noticeably when a
correlated "sample" is considered. This is particularly true for the Student
t-test. For example, in one case it is found that if the sample size is 32 and the
significance level is .05 when $\rho = 0$, then the significance level becomes .23 for
$\rho = .05$. This large change in significance level for a small change in $\rho$ is explained
by the factor given for the Student t-distribution in Table I. This
shows that test results which appear to be "significant" under the assumption of
randomness are not necessarily "significant" when correlation is present, even
though the amount of correlation may be small. The effect of correlation on the

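TABLE I

Quantity | Factor Multiplying Statistic for Correlated "Samples" | Distribution For Random Sample

$\frac{(\bar x - \mu)\sqrt{n(n - 1)}}{S}$, where $S^2 = \sum_1^n (x_i - \bar x)^2$ | $\sqrt{\frac{1 - \rho}{1 + (n - 1)\rho}}$ | Student t-distribution, $g_{n-1}(t) \, dt$

$\frac{S^2}{\sigma^2} = \frac{\sum_1^n (x_i - \bar x)^2}{\sigma^2}$ | $\frac{1}{1 - \rho}$ | $\chi^2$-distribution, $f_{n-1}(\chi^2) \, d\chi^2$

… | … | Snedecor F-distribution, $h_{n-1,m-1}(F) \, dF$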




$\chi^2$ and Snedecor F tests is not as great as for the Student t-test as can be seen from
the factors given for the $\chi^2$ and Snedecor F distributions in Table I.


3. Effect of intraclass correlation. The relations stated in Table I will now
be used to investigate the effect of intraclass correlation on the confidence coefficients
and significance levels of several common types of confidence intervals
and significance tests which were derived under the assumption of random
samples. The confidence intervals and significance tests considered are listed
in Table II, where $S^2$ and $S'^2$ are defined in Table I. These particular confidence
intervals and significance tests have the property that if $\alpha$ is the confidence
coefficient of the confidence interval listed for a given statistic, then $1 - \alpha$ is
the significance level of the significance test listed for that statistic, this relation
holding whether random samples or correlated "samples" are considered. For
this reason the tables given in this section will be limited to confidence coefficients;
the corresponding significance levels can be obtained by using the above
relation.

a. Student t-distribution. If a random sample of size n is drawn from a normal
population with mean $\mu$ and variance $\sigma^2$ (denoted by $N(\mu, \sigma^2)$), a confidence
interval for $\mu$ with confidence coefficient $\epsilon$ is given in Table II. If the n values
form a correlated "sample", however, it follows from Table I that the corresponding
confidence interval with coefficient $\epsilon$ is

$\bar x - t_\epsilon S \sqrt{\frac{1 + (n - 1)\rho}{n(n - 1)(1 - \rho)}} \le \mu \le \bar x + t_\epsilon S \sqrt{\frac{1 + (n - 1)\rho}{n(n - 1)(1 - \rho)}}$.







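TABLE II

Statistic | Parameter | Confidence Interval (Confidence Coefficient $\epsilon$) | Significance Test (Significance Level = $1 - \epsilon$) | Definitions of Constants

t | $\mu$ | $\bar x - \frac{t_\epsilon S}{\sqrt{n(n - 1)}} \le \mu \le \bar x + \frac{t_\epsilon S}{\sqrt{n(n - 1)}}$ | $\frac{|\bar x - \mu|}{S / \sqrt{n(n - 1)}} > t_\epsilon$ | $\int_{-t_\epsilon}^{t_\epsilon} g_{n-1}(t) \, dt = \epsilon$

$\chi^2$ | $\sigma^2$ | $0 \le \sigma^2 \le S^2 / \chi_\epsilon^2$ | $\sigma^2 / S^2 > 1 / \chi_\epsilon^2$ | $\int_{\chi_\epsilon^2}^{\infty} f_{n-1}(\chi^2) \, d\chi^2 = \epsilon$

F | $\sigma^2 / \sigma'^2$ | $0 \le \sigma^2 / \sigma'^2 \le S^2 / S'^2 F_\epsilon$ | $\frac{\sigma^2 S'^2}{\sigma'^2 S^2} > 1 / F_\epsilon$ | $\int_{F_\epsilon}^{\infty} h_{n-1,m-1}(F) \, dF = \epsilon$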




The confidence interval given in Table II can be rewritten as

$\bar x - t_\alpha S \sqrt{\frac{1 + (n - 1)\rho}{n(n - 1)(1 - \rho)}} \le \mu \le \bar x + t_\alpha S \sqrt{\frac{1 + (n - 1)\rho}{n(n - 1)(1 - \rho)}}$,

where

$t_\alpha = t_\epsilon \sqrt{\frac{1 - \rho}{1 + (n - 1)\rho}}$.

Hence if $\rho < 0$, $\alpha > \epsilon$ and the confidence coefficient of the confidence interval
in Table II is greater than $\epsilon$. This means that the significance level of the
corresponding significance test listed in Table II would be less than $1 - \epsilon$ so
that any test result which would be significant for a random sample would also
be significant for a correlated "sample" for which $\rho < 0$. If $\rho > 0$, however,
$\epsilon > \alpha$ and the significance level of the test would be greater than $1 - \epsilon$. Thus a
test result which would be significant for a random sample need no longer be
when $\rho > 0$. The effect of positive values of $\rho$ upon the confidence coefficient
$\alpha = \alpha_t(\rho, n)$ of the confidence interval of Table II is given in Table III for the
cases $\epsilon = .95$ and $.99$. Confidence intervals with unequal tails can be treated





_ @ 28 @® 


~ 


es OD 





EFFECT OF INTRACLASS CORRELATION 91 


‘jn a similar manner. It is thus seen that the effect of correlation on the con- 


fidence coefficient increases with the sample size n, and that even a very small 
amount of correlation can cause a large change in a. For example, for samples 
of size 16 a correlation of p = .05 will change the significance level from .05 to 
.135; for samples of size 32 a correlation of p = .05 will change the significance 
level from .01 to .102, and from .05 to .23. 
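The figures quoted above can be checked numerically. The following is a minimal sketch, not the author's computation: it reads the relation between the two limits as $t_\alpha = t_\epsilon \sqrt{(1-\rho)/(1+(n-1)\rho)}$ and evaluates the Student t integral by Simpson's rule. The function names (`t_density`, `central_prob`, `t_limit`, `alpha_t`) are hypothetical, and only the Python standard library is assumed.

```python
import math

def t_density(t, df):
    """Student t density with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def central_prob(c, df, steps=2000):
    """P(|T| <= c), by composite Simpson integration on [0, c] (steps is even)."""
    if c <= 0:
        return 0.0
    h = c / steps
    total = t_density(0.0, df) + t_density(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 2 * total * h / 3

def t_limit(eps, df):
    """Solve central_prob(t, df) = eps for t by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while central_prob(hi, df) < eps:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_prob(mid, df) < eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def alpha_t(rho, n, eps):
    """Actual confidence coefficient of the Table II interval under correlation rho.

    Assumes 1 + (n - 1) * rho > 0 and rho < 1.
    """
    df = n - 1
    shrink = math.sqrt((1 - rho) / (1 + (n - 1) * rho))
    return central_prob(t_limit(eps, df) * shrink, df)

if __name__ == "__main__":
    for n, eps in ((16, 0.95), (32, 0.99), (32, 0.95)):
        print(n, eps, round(alpha_t(0.05, n, eps), 3))
```

Under these assumptions the sketch should reproduce the quoted changes in significance level to within rounding; small differences from the printed table would reflect the numerical method, not the theory.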

Confidence intervals for $\mu - \mu'$ are given by Theorem 5 of section 4. It is to be observed that if $\rho = \rho' = \rho''$ and $\sigma = \sigma'$ the confidence coefficients are independent of $\rho$ and $\sigma$. If $m = n$, $\rho = \rho'$, $\sigma = \sigma'$, $\rho'' = 0$, however, the confidence coefficients of the confidence intervals for $\mu - \mu'$ have the values $\alpha = \alpha_t(\rho, n)$ given in Table III.


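TABLE III
Values of $\alpha_t(\rho, n)$

n | $\rho = 0$ | .05 | .1 | .2 | .3 | .4 | .5
4 | .99 | | .983 | .974 | .961 | .944 | .920
4 | .95 | | .921 | .890 | .855 | .805 | .744
8 | .99 | | .959 | .913 | .853 | .790 |
8 | .95 | | .865 | .767 | .620 | |
16 | .99 | | .903 | .795 | .690 | .600 | .515
16 | .95 | .865 | .74 | .64 | .54 | |
32 | .99 | .898 | .79 | .63 | | |
32 | .95 | .77 | .68 | | | |
64 | .99 | .79 | | | | |
128 | .99 | .68 | | | | |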





b. $\chi^2$-distribution. If a random sample of size $n$ is drawn from $N(\mu, \sigma^2)$, a confidence interval for $\sigma^2$ with coefficient $\epsilon$ is given in Table II. If the $n$ values form a correlated "sample", it follows from Table I that the corresponding confidence interval with coefficient $\epsilon$ is

$$0 < \sigma^2 \le \frac{S^2}{\chi^2_\epsilon (1 - \rho)}.$$

The confidence interval in Table II can be rewritten as

$$0 < \sigma^2 \le \frac{S^2}{\chi^2_\alpha (1 - \rho)},$$

where

$$\chi^2_\alpha = \chi^2_\epsilon/(1 - \rho).$$

Hence if $\rho < 0$, $\alpha > \epsilon$ and the significance level of the significance test given in Table II is less than $1 - \epsilon$. If $\rho > 0$, the significance level of the test is greater than $1 - \epsilon$. The effect of positive values of $\rho$ upon the confidence coefficient $\alpha = \alpha_{\chi^2}(\rho, n)$ of the confidence interval listed in Table II is given in Table IV for $\epsilon = .95$ and $.99$. Cases in which the lower limit of the confidence interval is not zero can be treated in a similar manner. Table IV shows that, for a fixed value of $\rho$, the confidence coefficient $\alpha = \alpha_{\chi^2}(\rho, n)$ decreases as the sample size $n$ increases. Although the effect of correlation for the $\chi^2$-distribution is not as great as for the Student t-distribution, it does cause a noticeable change in $\alpha$. For example, for samples of size 16 the significance level of the test in Table II is changed from .05 to .081 if $\rho = .1$ and from .05 to .13 if $\rho = .2$. For samples of size 32 the significance level is changed from .05 to .10 for $\rho = .1$ and from .05 to .19 for $\rho = .2$.
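The same check can be made for the $\chi^2$ case. The sketch below assumes the relation $\chi^2_\alpha = \chi^2_\epsilon/(1-\rho)$, with $\chi^2_\epsilon$ the point whose upper tail probability is $\epsilon$. It computes the tail probability for integer degrees of freedom from the standard recursion $Q(a+1, z) = Q(a, z) + z^a e^{-z}/\Gamma(a+1)$ for the upper incomplete gamma ratio. The function names are hypothetical and are not taken from the paper.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= 1 exceeds x)."""
    if x <= 0:
        return 1.0
    z = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))     # Q(1/2, z)
    while a < df / 2 - 1e-9:
        # Q(a + 1, z) = Q(a, z) + z^a e^{-z} / Gamma(a + 1)
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1))
        a += 1
    return q

def chi2_limit(eps, df):
    """Solve chi2_upper_tail(x, df) = eps for x by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > eps:
        hi *= 2
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def alpha_chi2(rho, n, eps):
    """Actual confidence coefficient of the Table II interval for sigma^2 (rho < 1)."""
    df = n - 1
    return chi2_upper_tail(chi2_limit(eps, df) / (1 - rho), df)

if __name__ == "__main__":
    for rho in (0.1, 0.2, 0.3, 0.4, 0.5):
        print(rho, round(alpha_chi2(rho, 16, 0.95), 3))
```

For $n = 16$ and $\epsilon = .95$ the values printed should agree with the quoted significance levels .081 and .13 (one minus the coefficient) up to rounding.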

TABLE IV
Values of $\alpha_{\chi^2}(\rho, n)$

n | $\rho = 0$ | .1 | .2 | .3 | .4 | .5
4 | .99 | .988 | .986 | .983 | .979 | .971
4 | .95 | .941 | .930 | .918 | .900 | .872
16 | .99 | .982 | .966 | .941 | .890 | .790
16 | .95 | .919 | .87 | .79 | .67 | .49
32 | .99 | .975 | .946 | .867 | .715 | .44
32 | .95 | .90 | .81 | .64 | .38 | .17

c. Snedecor F-distribution. If two random samples, one of size $n$ (denoted by $x$'s) and the other of size $m$ (denoted by $y$'s), are drawn from $N(\mu, \sigma^2)$ and $N(\mu', \sigma'^2)$ respectively, a confidence interval for $\sigma^2/\sigma'^2$ with coefficient $\epsilon$ is given in Table II. If the values form two correlated "samples", however, it follows from Table I that the corresponding confidence interval with coefficient $\epsilon$ is

$$0 < \sigma^2/\sigma'^2 \le \frac{S^2 (1 - \rho')}{S'^2 (1 - \rho) F_\epsilon}.$$

The confidence interval in Table II can be restated as

$$0 < \sigma^2/\sigma'^2 \le \frac{S^2 (1 - \rho')}{S'^2 (1 - \rho) F_\alpha},$$

where

$$F_\alpha = F_\epsilon (1 - \rho')/(1 - \rho).$$

Thus if $\rho = \rho'$, $\alpha = \epsilon$ and the significance level of the significance test given in Table II remains equal to $1 - \epsilon$. If $(1 - \rho')/(1 - \rho) < 1$, $\alpha > \epsilon$ and the significance level is less than $1 - \epsilon$. If $(1 - \rho')/(1 - \rho) > 1$, however, $\alpha < \epsilon$ and the significance level is greater than $1 - \epsilon$. Values of the confidence coefficient $\alpha = \alpha_F\left(\frac{1-\rho'}{1-\rho}, n, m\right)$ of the confidence interval listed in Table II are given in Table V for $\epsilon = .95$ and $.99$. Cases in which the lower limit of the confidence interval is not zero can be treated in a manner similar to that given above. Table V indicates that the effect of correlation on the confidence coefficient is not as great for $n < m$ as for $n > m$. For example, if $n = 4$, $m = 32$, $\frac{1-\rho'}{1-\rho} = 1.25$, the significance level of the significance test given in Table II is only changed from .05 to .069, if $\frac{1-\rho'}{1-\rho} = 1.5$ from .05 to .087. If $n = 32$, $m = 4$, $\frac{1-\rho'}{1-\rho} = 1.25$, however, the significance level is changed from .05 to .094, if $\frac{1-\rho'}{1-\rho} = 1.5$ from .05 to .142. Also it is seen that for fixed $\frac{1-\rho'}{1-\rho}$, the effect of intraclass correlation increases with both $n$ and $m$.

TABLE V
Values of $\alpha_F\left(\frac{1-\rho'}{1-\rho}, n, m\right)$

n | m | $\frac{1-\rho'}{1-\rho} = 1$ | 1.25 | 1.5 | 2.0
4 | 4 | .99 | .987 | .983 | .975
4 | 4 | .95 | .933 | .916 | .880
16 | 4 | .99 | .978 | .962 | .917
16 | 4 | .95 | .912 | .869 | .778
32 | 4 | .99 | .975 | .952 | .896
32 | 4 | .95 | .906 | .858 | .753
4 | 16 | .99 | .987 | .985 | .977
4 | 16 | .95 | .933 | .914 | .875
16 | 16 | .99 | .973 | .945 | .858
16 | 16 | .95 | .892 | .817 | .637
32 | 16 | .99 | .919 | .837 | .628
32 | 16 | .95 | .869 | .763 | .518
4 | 32 | .99 | .987 | .985 | .977
4 | 32 | .95 | .931 | .913 | .874
32 | 32 | .99 | .960 | .893 | .675
32 | 32 | .95 | .850 | .707 | .400










4. Analysis. This section contains derivations of the relations stated in the 
first three sections. The method used in these derivations is similar to that used 
in one approach to the analysis of variance and consists essentially in expressing 
each variable as the sum of two quantities, one of which is the same for each 
variable and the other of which is different for each variable. 

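Let $x_1, \cdots, x_n$ represent a correlated "sample", that is, have a normal multivariate distribution for which

$$\begin{aligned} E(x_i) &= \mu, && (i = 1, \cdots, n) \\ E[(x_i - \mu)^2] &= \sigma^2 \\ E[(x_i - \mu)(x_j - \mu)] &= \rho\sigma^2, && (i \ne j = 1, \cdots, n). \end{aligned} \tag{1}$$

Write the $x_i$, $(i = 1, \cdots, n)$, in the form

$$x_i = \eta + \lambda\bar{\xi} + \xi_i,$$

where $\bar{\xi} = \sum_1^n \xi_i/n$ and $\eta, \xi_1, \cdots, \xi_n$ are independently distributed, $\eta$ according to $N(\mu, \sigma_\eta^2)$ and the $\xi_i$ according to $N(0, \sigma_\xi^2)$. The values of $\lambda$, $\sigma_\eta^2$ and $\sigma_\xi^2$ are chosen so that the $x_i = \eta + \lambda\bar{\xi} + \xi_i$ satisfy (1). It is easily proved that it is always possible to choose $\lambda$, $\sigma_\eta^2$ and $\sigma_\xi^2$ so that (1) are satisfied. It is to be remembered that $\rho \ge -1/(n - 1)$ for intraclass correlation. From relations (1) and $x_i = \eta + \lambda\bar{\xi} + \xi_i$ it follows that

$$E(\xi_i^2) = \sigma^2(1 - \rho), \qquad (i = 1, \cdots, n). \tag{2}$$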
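The text states that suitable constants can always be chosen but does not exhibit them. The sketch below gives one possible choice, offered as an assumption and not as the author's: take $\lambda = 0$, $\sigma_\eta^2 = \rho\sigma^2$ when $\rho \ge 0$, and $\sigma_\eta^2 = 0$, $(1 + \lambda)^2 = [1 + (n-1)\rho]/(1-\rho)$ when $\rho < 0$. It then recomputes the variance and covariance implied by $x_i = \eta + \lambda\bar{\xi} + \xi_i$. The function names are hypothetical.

```python
import math

def components(n, sigma2, rho):
    """One choice of (lam, var_eta, var_xi) giving Var(x_i) = sigma2, Cov(x_i, x_j) = rho * sigma2."""
    if not (-1 / (n - 1) <= rho < 1):
        raise ValueError("need -1/(n-1) <= rho < 1")
    var_xi = sigma2 * (1 - rho)          # forced by relation (2)
    if rho >= 0:
        return 0.0, rho * sigma2, var_xi
    lam = math.sqrt((1 + (n - 1) * rho) / (1 - rho)) - 1
    return lam, 0.0, var_xi

def implied_moments(n, lam, var_eta, var_xi):
    """Variance and covariance of x_i = eta + lam * xi_bar + xi_i for independent components."""
    # Var(xi_bar) = Cov(xi_i, xi_bar) = var_xi / n
    shared = var_eta + (lam * lam + 2 * lam) * var_xi / n
    return shared + var_xi, shared

if __name__ == "__main__":
    for rho in (0.3, 0.0, -0.2):
        lam, var_eta, var_xi = components(5, 2.0, rho)
        print(rho, implied_moments(5, lam, var_eta, var_xi))
```

The negative branch needs $1 + (n-1)\rho \ge 0$, which is exactly the bound on intraclass correlation recalled in the text.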


THEOREM 1. The quantity $\frac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar{x})^2$ has a $\chi^2$-distribution with $n - 1$ degrees of freedom and is distributed independently of $\bar{x}$.

PROOF. Since the $\xi_i$ are independently distributed according to the same normal distribution with zero mean, it follows from (2) that

$$\frac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar{x})^2 = \frac{1}{\sigma^2(1-\rho)} \sum_1^n (\xi_i - \bar{\xi})^2$$

has a $\chi^2$-distribution with $n - 1$ degrees of freedom and is distributed independently of $\bar{x} = \eta + (1 + \lambda)\bar{\xi}$.

THEOREM 2.

$$\frac{(\bar{x} - \mu)\sqrt{n(n-1)(1-\rho)}}{\sqrt{1 + (n-1)\rho}\,\sqrt{\sum_1^n (x_i - \bar{x})^2}}$$

has a Student t-distribution with $n - 1$ degrees of freedom.

PROOF. It is easily seen from elementary considerations that

$$\frac{(\bar{x} - \mu)\sqrt{n}}{\sigma\sqrt{1 + (n-1)\rho}}$$

has the distribution $N(0, 1)$. Theorem 2 is then an immediate consequence of Theorem 1.

Up to this point a single correlated "sample" of size $n$ has been considered. The next part of the analysis, however, will be concerned with properties which arise from the consideration of two correlated "samples."


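Let $x_1, \cdots, x_n, y_1, \cdots, y_m$ have a joint normal multivariate distribution such that

$$\begin{aligned} E(x_i) &= \mu, && (i = 1, \cdots, n) \\ E(y_a) &= \mu', && (a = 1, \cdots, m) \\ E[(x_i - \mu)^2] &= \sigma^2 \\ E[(y_a - \mu')^2] &= \sigma'^2 \\ E[(x_i - \mu)(x_j - \mu)] &= \rho\sigma^2, && (i \ne j = 1, \cdots, n) \\ E[(y_a - \mu')(y_b - \mu')] &= \rho'\sigma'^2, && (a \ne b = 1, \cdots, m) \\ E[(x_i - \mu)(y_a - \mu')] &= \rho''\sigma\sigma'. \end{aligned} \tag{3}$$

Write the $x_i$ and $y_a$ in the form

$$\begin{aligned} x_i &= \eta + \lambda_1\bar{\xi} + \lambda_2\bar{\xi}' + \xi_i \\ y_a &= \eta' + \lambda_3\bar{\xi} + \lambda_4\bar{\xi}' + \xi_a', \end{aligned} \tag{4}$$

where $\bar{\xi}' = \sum_1^m \xi_a'/m$ and $\eta, \eta', \xi_1, \cdots, \xi_n, \xi_1', \cdots, \xi_m'$ are independently distributed, $\eta$ according to $N(\mu, \sigma_\eta^2)$, $\eta'$ according to $N(\mu', \sigma_{\eta'}^2)$, the $\xi_i$ according to $N(0, \sigma_\xi^2)$, and the $\xi_a'$ according to $N(0, \sigma_{\xi'}^2)$. The quantities $\lambda_1, \lambda_2, \lambda_3, \lambda_4, \sigma_\eta^2, \sigma_{\eta'}^2, \sigma_\xi^2, \sigma_{\xi'}^2$ are chosen so that the $x_i$ and $y_a$ satisfy (3). It is easily verified that it is always possible to choose these quantities so that the $x_i$ and $y_a$ constructed in this fashion satisfy (3). In addition it follows from (3) and (4) that

$$\begin{aligned} E(\xi_i^2) &= \sigma^2(1 - \rho) \\ E(\xi_a'^2) &= \sigma'^2(1 - \rho'). \end{aligned} \tag{5}$$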


THEOREM 3. $\frac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar{x})^2$ and $\frac{1}{\sigma'^2(1-\rho')} \sum_1^m (y_a - \bar{y})^2$ have $\chi^2$-distributions with $n - 1$ and $m - 1$ degrees of freedom respectively, and are distributed independently of each other and of $\bar{x}$ and $\bar{y}$.

PROOF. From Theorem 1 and (5) it follows that $\frac{1}{\sigma^2(1-\rho)} \sum_1^n (x_i - \bar{x})^2$ and $\frac{1}{\sigma'^2(1-\rho')} \sum_1^m (y_a - \bar{y})^2$ have $\chi^2$-distributions with $n - 1$ and $m - 1$ degrees of freedom respectively. That they are distributed independently of each other and of both $\bar{x}$ and $\bar{y}$ follows from (4).

THEOREM 4. $\dfrac{\sigma'^2(1-\rho') \sum_1^n (x_i - \bar{x})^2}{\sigma^2(1-\rho) \sum_1^m (y_a - \bar{y})^2}$ is distributed according to the Snedecor F-distribution $h_{n-1,\,m-1}(F)\,dF$.

PROOF. This follows from Theorem 3.

THEOREM 5.

$$\frac{[(\bar{x} - \bar{y}) - (\mu - \mu')]\sqrt{n + m - 2}}{\sigma_0 \sqrt{\dfrac{\sum_1^n (x_i - \bar{x})^2}{\sigma^2(1-\rho)} + \dfrac{\sum_1^m (y_a - \bar{y})^2}{\sigma'^2(1-\rho')}}},$$

where

$$\sigma_0^2 = \frac{\sigma^2}{n}[1 + (n-1)\rho] + \frac{\sigma'^2}{m}[1 + (m-1)\rho'] - 2\rho''\sigma\sigma',$$

has a Student t-distribution with $n + m - 2$ degrees of freedom.

PROOF. It is easily seen from elementary considerations that $\frac{1}{\sigma_0}[(\bar{x} - \bar{y}) - (\mu - \mu')]$ has the distribution $N(0, 1)$. Theorem 5 then follows from Theorem 3.
The author wishes to express his appreciation to Professor John W. Tukey for 
valuable assistance and advice in the preparation of this paper. 





ON FAMILIES OF ADMISSIBLE TESTS 


By E. L. LEHMANN 
University of California, Berkeley 


1. Summary. For each hypothesis H of a certain class of simple hypotheses, a 
family F of tests is determined such that 

(a) given any test w of H there exists a test w’ belonging to F which has power 

uniformly greater than or equal to that of w. 
(b) no member of F has power uniformly greater than or equal to that of any 
other member of F. 

The effect on F of various assumptions about the set of alternatives is considered. As an application an optimum property of the known type $A_1$ tests is proved, and a result is obtained concerning the most stringent tests of the hypotheses considered.


2. Introduction. In the theory of testing simple hypotheses, if a uniformly 
most powerful test exists, it is the most desirable test to use. If, as is generally 
the case, such a test does not exist, the choice between tests none of which is 
“altogether better” than all the others, has to be based on information not con- 
tained in the general formulation of the testing problem. If no such additional 
information is available, the choice must of necessity be somewhat arbitrary. 

Now although a single uniformly most powerful test exists only in exceptional 


cases, there will always exist a family F of tests such that 
(a) given any test w of the hypotheses H under consideration and of prescribed 
level of significance, there exists a test w’ belonging to F which has power 
uniformly greater than or equal to that of w. 
(b) no member of F has power uniformly greater than or equal to that of any 
other member of F. 
The family F is essentially unique. Arbitrariness occurs only since a test region 
is not uniquely determined by its power function. But since two tests with the 
same power function are equivalent for testing purposes, it is from the present 
point of view immaterial which one is included in F. 

With the same restriction F is essentially the family of admissible tests, a 
test w being admissible if there is no test of the same level of significance which 
has power uniformly greater than or equal to but not identically equal to that of 
w. This definition differs only trivially from the one given by Wald [1, p. 15] 
who defines a test w to be non-admissible if there exists a w’ with power every- 
where greater than that of w (except at the hypothetical point). 

F naturally depends on the class of alternatives considered. A restriction 
in the class of alternatives may (although it will not necessarily) diminish F. 
The family F may also be decreased by other additional information: For 
instance a probability distribution may be assumed for the set of alternatives, 
and some properties of this distribution may be presupposed. 



The determination of the family F, (and a description of the power functions 
of the tests in F) might be considered a solution of the testing problem. The 
solution is not unique and hence does not provide a basis for action. This 
reflects the fact that additional information is needed to make possible the 
unique choice of a best test. On the basis of the available information, F repre- 
sents the furthest reduction of the problem that seems possible. On the one 
hand, if the choice of test is to be made from the point of view of power, the only 
contestants for “‘best test’’ are the members of F. On the other hand, the 
available information does not give preference to any one member of F over any 
other unless additional principles (such as unbiasedness for instance) are 
introduced. 

It is the purpose of the present paper to illustrate the above notions by deter- 
mining F for a very simple case. 


3. Determination of the family F. Let the random variable

$$E = (X_1, X_2, \cdots, X_n)$$

have a probability density function

$$p_\theta(e) \tag{1}$$

depending on parameter $\theta$. Concerning (1) we shall make the assumptions under which Neyman [2, 3] has shown the existence of the type $A_1$ test of the hypothesis

$$H: \theta = \theta_0. \tag{2}$$

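ASSUMPTIONS:

(a) Conditions of regularity:

The integral

$$\int_w p_\theta(e)\,de, \qquad e = (x_1, \cdots, x_n), \quad de = dx_1 \cdots dx_n, \tag{3}$$

extended over any region $w$ in the sample space, admits of two successive derivatives with respect to $\theta$ under the integral sign, i.e.

$$\frac{d^k}{d\theta^k} \int_w p_\theta(e)\,de = \int_w \frac{\partial^k}{\partial\theta^k} p_\theta(e)\,de \qquad \text{for } k = 1, 2. \tag{4}$$

(b) A differential equation:

If

$$\varphi_\theta(e) = \frac{\partial}{\partial\theta} \log p_\theta(e), \qquad \varphi_\theta'(e) = \frac{\partial}{\partial\theta} \varphi_\theta(e), \tag{5}$$

$\varphi_\theta$ is not identically zero, and there exist functions of $\theta$ (but independent of $e$), $A$ and $B$, such that

$$\varphi_\theta' = A + B\varphi_\theta. \tag{6}$$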


Under these assumptions Neyman has shown

A. that the probability density function $p_\theta$ is of the form

$$p_\theta(e) = \exp\{P(\theta) + T(e) \cdot Q(\theta) + R(e)\} \tag{7}$$

where $Q$ is a monotone function with $\frac{d}{d\theta} Q(\theta)\big|_{\theta=\theta_0} \ne 0$ (without loss of generality we shall assume $Q$ monotonely increasing) and

B. that the type $A_1$ test of the hypothesis $H$ exists, and is given by

$$T(e) < c_1, \qquad T(e) > c_2 \tag{8}$$

for suitable choice of $c_1$ and $c_2$.

In what follows we shall assume that the permissible first kind error in testing $H$ is fixed throughout and has the value $\epsilon$. By a test $w$ of $H$ we shall always mean a test of level of significance $\epsilon$, i.e. satisfying

$$\int_w p_{\theta_0}(e)\,de = \epsilon. \tag{9}$$

Let us consider the family of tests

$$w(k):\ T(e) < k,\quad T(e) > f(k);\qquad k < f(k) \tag{10}$$

where $f(k)$ is determined by (9). It easily follows from (9) that $k$ can take on all values from $-\infty$ to $k_0$, say, where $k_0$ is such that

$$f(k_0) = +\infty. \tag{11}$$

For the family $F$ of tests $\{w(k)\}$, $-\infty \le k \le k_0$, we now state

THEOREM 1. All members of F are admissible, and if w is any admissible test 
not in F, there exists a member of F which has power identical with that of w. 

We first prove the 

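LEMMA. Let $\beta_w$ denote the power function of a test $w$. Then if $k_1 < k_2$

$$\begin{aligned} \beta_{w(k_1)}(\theta) &< \beta_{w(k_2)}(\theta) && \text{if } \theta < \theta_0 \\ \beta_{w(k_1)}(\theta) &> \beta_{w(k_2)}(\theta) && \text{if } \theta > \theta_0. \end{aligned}$$

PROOF: Let $\bar{w}$ denote the complement of a region $w$. Consider the intervals

$$I = w(k_1) \cdot \bar{w}(k_2) \tag{12}$$

$$J = \bar{w}(k_1) \cdot w(k_2). \tag{13}$$

$I$ lies entirely to the right of $J$. Let $\theta > \theta_0$. Then

$$\frac{p_\theta(e)}{p_{\theta_0}(e)} = \frac{C(\theta)}{C(\theta_0)} \exp\{T(e)[Q(\theta) - Q(\theta_0)]\}, \tag{14}$$

with $C(\theta) = \exp P(\theta)$ in the notation of (7), is a strictly increasing function of $T$ since $Q$ is increasing. Therefore there exists a constant $C$ such that

$$\begin{aligned} \frac{p_\theta(e)}{p_{\theta_0}(e)} &< C && \text{if } T(e) \text{ is in } J \\ C &< \frac{p_\theta(e)}{p_{\theta_0}(e)} && \text{if } T(e) \text{ is in } I. \end{aligned} \tag{15}$$

Since

$$\int_{w(k_1)} p_{\theta_0}(e)\,de = \int_{w(k_2)} p_{\theta_0}(e)\,de \tag{16}$$

we have

$$\int_{T \in I} p_{\theta_0}(e)\,de = \int_{T \in J} p_{\theta_0}(e)\,de \tag{17}$$

and therefore

$$\int_{T \in J} p_\theta(e)\,de < C \cdot \int_{T \in J} p_{\theta_0}(e)\,de = C \cdot \int_{T \in I} p_{\theta_0}(e)\,de < \int_{T \in I} p_\theta(e)\,de \tag{18}$$

from which it follows that

$$\int_{w(k_2)} p_\theta(e)\,de < \int_{w(k_1)} p_\theta(e)\,de \tag{19}$$

which is the desired result.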
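The ordering asserted by the lemma can be seen numerically in a simple special case. The sketch below is an illustration only, under assumptions that are not part of the paper: a single observation $T$ distributed as $N(\theta, 1)$, $\theta_0 = 0$, and level $\epsilon = .05$. The names `upper_cut` and `power` are hypothetical; `upper_cut(k, eps)` plays the role of $f(k)$.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def upper_cut(k, eps):
    """f(k): right-hand cut-off so that P(T < k) + P(T > f(k)) = eps when theta = 0."""
    p = 1 - eps + STD.cdf(k)
    if p >= 1:
        raise ValueError("k must lie below k_0, where f(k_0) is infinite")
    return STD.inv_cdf(p)

def power(k, theta, eps):
    """Power of the test w(k): reject when T < k or T > f(k), with T ~ N(theta, 1)."""
    return STD.cdf(k - theta) + 1 - STD.cdf(upper_cut(k, eps) - theta)

if __name__ == "__main__":
    eps, k1, k2 = 0.05, -3.0, -2.0
    for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(theta, round(power(k1, theta, eps), 4), round(power(k2, theta, eps), 4))
```

In this special case the test with the larger $k$ should show the larger power for negative $\theta$ and the smaller power for positive $\theta$, while both have power $\epsilon$ at $\theta = 0$. Passing `float("-inf")` as `k` gives the one-sided test $w(-\infty)$.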
PROOF OF THEOREM 1. The proof consists of several parts.

I. Let $m$ be any real number, and assume that there exists a value of $k$ such that

$$\beta_w(\theta_0) = \epsilon \tag{20}$$

$$\frac{d}{d\theta} \beta_w(\theta)\Big|_{\theta=\theta_0} = m \tag{21}$$

for $w = w(k)$. Then $w(k)$ has power uniformly greater than or equal to that of any other test satisfying (20) and (21).

For $m = 0$ this becomes Neyman's theorem stating that the type A test is also of type $A_1$. The proof of the theorem however is independent of the value of

$$\frac{d}{d\theta} \beta_w(\theta)\Big|_{\theta=\theta_0} \tag{23}$$

and hence carries over to arbitrary $m$.

II. If there exists any test satisfying (20) and (21) then there exists a number $k$ for which $w(k)$ also satisfies (20) and (21).























To prove this let us determine, of all tests satisfying (20), the one which maximizes

$$\frac{d}{d\theta} \beta_w(\theta)\Big|_{\theta=\theta_0} = \int_w \frac{\partial}{\partial\theta} p_\theta(e)\Big|_{\theta=\theta_0}\,de. \tag{24}$$

This can be done by means of the lemma of Neyman and Pearson [4, p. 11] which gives sufficient conditions for a region $w$, subject to restrictions

$$\int_w f_i(e)\,de = a_i, \qquad (i = 1, \cdots, p),$$

to maximize an integral

$$\int_w g(e)\,de.$$

According to this lemma the desired test is of the form

$$\frac{\partial}{\partial\theta} p_\theta(e)\Big|_{\theta=\theta_0} > a \cdot p_{\theta_0}(e) \tag{25}$$

provided a value of $a$ exists for which this test satisfies (20). By (7), (25) is equivalent to

$$P'(\theta_0) + T(e) \cdot Q'(\theta_0) > a \tag{26}$$

or, since $Q'(\theta_0) > 0$, to

$$T(e) > b. \tag{27}$$

Thus, if a number $b$ exists such that the test (27) satisfies (20), this test is the one maximizing (24). But such a number does exist, namely $f(-\infty)$. Therefore $w(-\infty)$ is the desired test.

Similarly it is easy to show that of all tests satisfying (20), $w(k_0)$ minimizes (24). But

$$\frac{d}{d\theta} \beta_{w(k)}(\theta)\Big|_{\theta=\theta_0} = \int_{T<k,\ T>f(k)} \frac{\partial}{\partial\theta} p_\theta(e)\Big|_{\theta=\theta_0}\,de \tag{28}$$

is a continuous function of $k$, and therefore takes on all intermediate values, which establishes II.

III. From I. and II. we conclude that given any test $w$ there exists a member of $F$ which has power uniformly greater than or equal to that of $w$. For let $w$ be any test of $H$. From the condition of regularity it follows that its power function has a derivative at $\theta_0$. By II. there exists a value of $k$ such that the power function of $w(k)$ has the same slope at $\theta_0$, and from I. it follows that $w(k)$ is uniformly more powerful than $w$.

But from the lemma we see that none of the tests $w(k)$ is uniformly more powerful than any other. Hence all members of $F$ are admissible, and the theorem is proved.

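From the lemma and Theorem 1 we can conclude for all members of $F$ the following optimum property:

COROLLARY 1: Let $w$ be any test, and let $w_0$ be any member of $F$. Then at least one of the two statements

$$\begin{aligned} \beta_w(\theta) &\le \beta_{w_0}(\theta) && \text{for all } \theta < \theta_0 \\ \beta_w(\theta) &\le \beta_{w_0}(\theta) && \text{for all } \theta > \theta_0 \end{aligned} \tag{29}$$

must hold.

The lemma and Theorem 1 also give the following result concerning most stringent tests, defined by Wald [1, p. 33].

COROLLARY 2: There exists a uniformly most powerful of all most stringent tests. It is that unique member $w_0$ of $F$ for which

$$\underset{\theta > \theta_0}{\mathrm{l.u.b.}} \left[ \underset{w}{\mathrm{l.u.b.}}\ \beta_w(\theta) - \beta_{w_0}(\theta) \right] = \underset{\theta < \theta_0}{\mathrm{l.u.b.}} \left[ \underset{w}{\mathrm{l.u.b.}}\ \beta_w(\theta) - \beta_{w_0}(\theta) \right].$$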

4. The effect on F of assumptions about the alternatives. Let us next consider how a restriction in the set of alternatives affects the family $F$. From the lemma it follows that there is no change as long as the set of alternatives contains values of $\theta$ both greater and less than $\theta_0$. On the other hand, if the alternatives are restricted to values of $\theta$ greater than $\theta_0$, say, the family $F$ for testing $H$ against these alternatives consists of only a single member, the test $w(-\infty)$ (and similarly for the other one-sided case). This follows from

THEOREM 2: Under conditions a. and b. the test $w(-\infty)$ is uniformly most powerful against the alternatives $\theta > \theta_0$, the test $w(k_0)$ is uniformly most powerful against the alternatives $\theta < \theta_0$.

PROOF: Let $w$ be any test. By Theorem 1 there exists a number $k$ such that

$$\beta_w(\theta) \le \beta_{w(k)}(\theta) \qquad \text{for all } \theta. \tag{30}$$

From the lemma it follows that

$$\begin{aligned} \beta_{w(k)}(\theta) &\le \beta_{w(k_0)}(\theta) && \text{if } \theta < \theta_0 \\ \beta_{w(k)}(\theta) &\le \beta_{w(-\infty)}(\theta) && \text{if } \theta > \theta_0. \end{aligned} \tag{31}$$

Combining (30) and (31) we have the desired result.

(It is also easy to prove Theorem 2 directly from the Neyman-Pearson lemma.)

In order to illustrate how the assumption of an a priori distribution of $\theta$ together with some information about this distribution affects $F$, let us consider a special case of the class of hypotheses discussed so far.

Let

$$p_\theta(x_1, \cdots, x_n) = c \cdot e^{-\frac{1}{2}\sum (x_i - \theta)^2} \tag{32}$$

so that $E = (X_1, X_2, \cdots, X_n)$ is a sample from a normal distribution with unit variance and unknown mean. We want to test the hypothesis

$$H: \theta = 0. \tag{33}$$

We shall show that if $\theta$ has a probability density function $g$ which is symmetric about the origin, then the family $F$ for testing $H$ consists, as might be expected, of a single member, the type $A_1$ test.

Our problem is to find the test $w$ satisfying

$$\int_w p_0(x_1, \cdots, x_n)\,dx_1 \cdots dx_n = \epsilon \tag{34}$$

and which maximizes

$$\int g(\theta) \int_w p_\theta(x_1, \cdots, x_n)\,dx_1 \cdots dx_n\,d\theta. \tag{35}$$

Inverting the order of integration, which is permissible in this case, the Neyman-Pearson lemma shows the desired test to be of the form

$$\int g(\theta)\,p_\theta(x_1, \cdots, x_n)\,d\theta > a \cdot p_0(x_1, \cdots, x_n) \tag{36}$$

provided a value of $a$ exists for which (36) satisfies (34). Substituting from (32), (36) becomes

$$f(\bar{x}) = \int g(\theta)\,e^{-\frac{n}{2}\theta^2 + n\theta\bar{x}}\,d\theta > a \tag{37}$$

where

$$\bar{x} = \frac{1}{n} \sum_{j=1}^n x_j. \tag{38}$$

Since

$$\frac{d^2 f}{d\bar{x}^2} > 0, \tag{39}$$

the region (37) is either empty, which would contradict (34), or else can be described by inequalities

$$\bar{x} < a_1, \qquad \bar{x} > a_2 \tag{40}$$

where

$$f(a_1) = f(a_2), \tag{41}$$

the latter equation becoming, on substitution from (37),

$$\int g(\theta)\,e^{-\frac{n}{2}\theta^2} \left(e^{n\theta a_1} - e^{n\theta a_2}\right) d\theta = 0. \tag{42}$$

If $g$ is an even function, (42) is certainly satisfied when $a_1 = -a_2$. Our test then becomes

$$\bar{x} < -a_2, \qquad \bar{x} > a_2 \tag{43}$$

which for proper choice of $a_2$ satisfies (34) and is the well known type $A_1$ test.













5. Concluding remarks. Let us consider once more a probability density function satisfying a. and b. We have seen that the family $F$ for testing $H$ against the alternatives $\theta \ne \theta_0$ contains an infinity of elements unless we make some additional assumptions. On the other hand, if the principle of unbiasedness is accepted, $F$ shrinks to a single element: the type $A_1$ test.

But unbiasedness does not insure power. Thus conceivably some other test might be more powerful than the test chosen, everywhere except in a small one-sided neighbourhood of $\theta_0$. That this is not so is shown by Corollary 1 to Theorem 1. This remark illustrates how intuitively appealing principles and a knowledge of the family $F$ may be used in conjunction to arrive at a choice of a satisfactory test, when not enough information is available to make the choice compelling.

Finally, it should be pointed out that although we restricted our considerations 
to simple hypotheses, the notions developed also apply to composite hypotheses. 


REFERENCES

[1] A. WALD, "On the principles of statistical inference". Notre Dame Mathematical Lectures, Number 1.

[2] J. NEYMAN, "L'estimation statistique traitée comme un problème classique de probabilité". Conférences internationales des sciences mathématiques à Genève: Colloque d'Octobre 1937, sur le Calcul des Probabilités, Paris, 1938.

[3] J. NEYMAN AND E. S. PEARSON, "Contributions to the theory of testing statistical hypotheses, Part II". Statistical Research Memoirs, Vol. II, London, 1938.

[4] J. NEYMAN AND E. S. PEARSON, "Contributions to the theory of testing statistical hypotheses, Part I". Statistical Research Memoirs, Vol. I, London, 1936.





CONDITIONAL EXPECTATION AND UNBIASED SEQUENTIAL 
ESTIMATION¹


By DAVID BLACKWELL


Howard University 


1. Summary. It is shown that $E[f(x) E(y \mid x)] = E(fy)$ whenever $E(fy)$ is finite, and that $\sigma^2 E(y \mid x) \le \sigma^2 y$, where $E(y \mid x)$ denotes the conditional expectation of $y$ with respect to $x$. These results imply that whenever there is a sufficient statistic $u$ and an unbiased estimate $t$, not a function of $u$ only, for a parameter $\theta$, the function $E(t \mid u)$, which is a function of $u$ only, is an unbiased estimate for $\theta$ with a variance smaller than that of $t$. A sequential unbiased estimate for a parameter is obtained, such that when the sequential test terminates after $i$ observations, the estimate is a function of a sufficient statistic for the parameter with respect to these observations. A special case of this estimate is that obtained by Girshick, Mosteller, and Savage [4] for the parameter of a binomial distribution.


2. Conditional expectation. Denote by $x$ any (not necessarily numerical) chance variable and by $y$ any numerical chance variable for which $E(y)$ is finite. There exists a function of $x$, the conditional expectation of $y$ with respect to $x$ [3, pp. 95-101, 5, pp. 41-44] which we denote, as usual, by $E(y \mid x)$ and which is uniquely defined except for events of zero probability, such that whenever $f(x)$ is the characteristic function of an event $F$ depending only on $x$ (i.e. $f = 1$ when $F$ occurs and $f = 0$ when $F$ does not occur), the equation

$$E[f(x) E(y \mid x)] = E[f(x) y] \tag{1}$$


holds. Now if f(x) is a simple function, i.e. a finite linear combination of char- 
acteristic functions, it is clear from the linearity of expectation that (1) continues 
to hold. Quite generally, we shall prove 

THEOREM 1: The equation (1) holds for every function f(x) for which E[f(x)y] 
is finite. 

To simplify notation, we write $E(z \mid x) = E_x z$ for any chance variable $z$. The following corollary to Theorem 1 asserts simply that the operations $E_x$ and multiplication by $f(x)$ are commutative. This fact, which is trivially equivalent to Theorem 1, has been stated by Kolmogoroff [5, p. 50].

COROLLARY: If $E[f(x)y]$ is finite, then $E_x[f(x)y] = f(x) E_x y$.

PROOF OF COROLLARY: If $g(x)$ is a characteristic function, then $E(g f E_x y) = E(gfy)$ by Theorem 1. Since $E_x(fy)$ is unique, the Corollary follows.

PROOF OF THEOREM 1: Since Theorem 1 holds when $f(x)$ is a simple function and the product of a simple function and a characteristic function is a simple function, the Corollary holds when $f(x)$ is a simple function.


¹ The author is indebted to M. A. Girshick for suggesting the problem which led to this paper and for many helpful discussions.

Now let $f(x)$ be any function for which $E(fy)$ is finite. There is a sequence of simple functions $f_n(x)$ such that $f_n(x) \to f(x)$ and $|f_n(x)| \le |f(x)|$. For instance we may define $f_n(x) = m/n$ when $m/n \le f(x) < (m + 1)/n$, $0 \le m \le n^2$, $f_n(x) = m/n$ when $(m - 1)/n < f(x) \le m/n$, $0 \ge m \ge -n^2$, $f_n(x) = 0$ otherwise.
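The approximating sequence can be written out directly. The sketch below assumes the reading $0 \le m \le n^2$ for non-negative values and $0 \ge m \ge -n^2$ for negative values, and works with the value $v = f(x)$ instead of with $x$ itself. The name `simple_approx` is hypothetical, and exact rational arithmetic is used only to keep the inequalities free of rounding error.

```python
import math
from fractions import Fraction

def simple_approx(value, n):
    """f_n evaluated at a point where f takes the given value.

    Rounds toward zero to a multiple of 1/n, and returns 0 when the
    multiple m/n would need |m| > n*n.
    """
    v = Fraction(value)
    if v >= 0:
        m = math.floor(v * n)      # m/n <= v < (m + 1)/n
        return Fraction(m, n) if m <= n * n else Fraction(0)
    m = math.ceil(v * n)           # (m - 1)/n < v <= m/n
    return Fraction(m, n) if m >= -n * n else Fraction(0)

if __name__ == "__main__":
    v = Fraction(-7, 3)
    for n in (1, 2, 5, 10, 100):
        print(n, simple_approx(v, n), float(simple_approx(v, n)))
```

Rounding toward zero is what gives $|f_n| \le |f|$, and the truncation at $|m| \le n^2$ keeps each $f_n$ a finite combination of characteristic functions; for fixed $v$ the error is below $1/n$ as soon as $n > |v|$.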

We recall the following proposition of Doob [2, p. 296]:

$$|E_x y| \le E_x |y| \tag{2}$$

with probability one. Then, using the Corollary (for simple functions) and (2), we have $|f_n E_x y| = |E_x(f_n y)| \le E_x |f_n y| \le E_x |fy|$. Also

$$E(f_n E_x y) = E(f_n y). \tag{3}$$

Since the two sequences of functions $f_n E_x y$, $f_n y$ are bounded in absolute value by the summable functions $E_x |fy|$, $|fy|$, Lebesgue's theorem [8, p. 29] applied to (3) yields (1).

In section 3 we shall use the fact that if $u$ is a sufficient statistic for a parameter $\theta$ and $f$ is any unbiased estimate for $\theta$, then $E(f \mid u)$ (which, since $u$ is a sufficient statistic, is a function of $u$ independent of $\theta$) is an unbiased estimate for $\theta$. This is obvious, since it follows from the definition of conditional expectation that the two chance variables $f$ and $E(f \mid u)$ have the same expected value. The interesting fact is that the estimate $E(f \mid u)$ is always a better estimate for $\theta$ than $f$ in the sense of having a smaller variance, unless $f$ is already a function of $u$ only, in which case the two estimates $f$ and $E(f \mid u)$ clearly coincide. This is simply the fact that the variance of the regression function of $f$ on $u$ is not greater than the variance of $f$. In the case of Gaussian variables, where the regression is linear, this fact has been noted by Doob [1, p. 231].² Our statement is embodied in

THEOREM 2: If $\sigma^2 y$ is finite, so is $\sigma^2 E_x y$, and $\sigma^2 E_x y \le \sigma^2 y$, with equality holding only if $E_x y = y$ with probability one.

PROOF: Denote by $m$ the common expected value of $y$ and $E_x y$. Suppose for the moment that $\sigma^2 E_x y$ is finite. By the Schwarz inequality $E[y E_x y]$ is then finite. Then $\sigma^2 y = E(y - m)^2 = E[(y - E_x y) + (E_x y - m)]^2 = E(y - E_x y)^2 + \sigma^2 E_x y$, since $E[E_x y (E_x y - m)] = E[y (E_x y - m)]$ by Theorem 1. Thus $\sigma^2 y$ exceeds $\sigma^2 E_x y$ by $E(y - E_x y)^2$, which is positive unless $y = E_x y$, i.e. $y$ is a function of $x$. Thus we obtain the usual decomposition: the variance of $y$ is the variance of the regression of $y$ on $x$ plus the variance of $y$ about the regression of $y$ on $x$.

To show that $\sigma^2 E_x y$ is finite, we require the following

LEMMA (SCHWARZ INEQUALITY): If $E(f^2)$ and $E(g^2)$ are finite, then, with probability one,

$$E_x^2(fg) \le E_x(f^2) E_x(g^2).$$

A proof can be constructed on the usual lines by considering the function $Q(x, \lambda) = E_x(f + \lambda g)^2$. There are, however, certain measure-theoretic difficulties in handling simultaneously the conditional expectations of the family of chance variables $(f + \lambda g)^2$; instead we shall give a simple direct proof based on the ordinary Schwarz inequality for integrals.

² For functions of finite variance it is possible to interpret conditional expectation as a projection in Hilbert space, when the statement becomes simply the Bessel inequality.

We may suppose $f \ge 0$, $g \ge 0$ with probability one, since, from (2),

$$E_x^2(fg) \le E_x^2(|f||g|)$$

with probability one. Unless the Lemma holds there are three positive numbers $a$, $b$, $c$ with $a^2 > bc$ for which the event

$$\{E_x(fg) > a,\ E_x(f^2) < b,\ E_x(g^2) < c\} = H$$

has positive probability. Then denoting by $h$ the characteristic function of $H$ and using the Schwarz inequality for integrals, we have

$$a^2 P^2(H) < E^2[h E_x(fg)] = E^2(hfg) \le E(hf^2) E(hg^2) = E[h E_x(f^2)]\,E[h E_x(g^2)] \le bc\,P^2(H),$$

which is impossible. This completes the proof of the Lemma.

The Lemma, with $f = y$, $g = 1$, yields $E_x^2(y) \le E_x(y^2)$ with probability one, which implies the finiteness of $\sigma^2 E_x y$ and hence completes the proof of Theorem 2.


3. Unbiased sequential estimation. Consider a chance variable $z$ whose distribution depends on a parameter $\theta$. If we have an unbiased estimate $t(z)$ and a sufficient statistic $u(z)$ (not necessarily a single numerical chance variable) for $\theta$, then, as mentioned in section 2, $v(u) = E(t \mid u)$ is an unbiased estimate for $\theta$ depending only on $u$.³ We have shown that the variance of $v$ is never greater than that of $t$, and we shall see that it is sometimes much smaller (see example II at the end of this section). The estimate obtained in this section for the parameter of a sequential process is of the $v$ type; its importance lies in the fact that in many cases there is an unbiased estimate $t$ (generally poor) which is a function of the first observation, and which will consequently be an unbiased estimate no matter what sequential test procedure is used.
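A fixed-sample illustration of this improvement, given here as an assumption-laden sketch and not as an example from the paper: for $n$ independent binomial trials with parameter $p$ (taken strictly between 0 and 1), the first observation $t = z_1$ is unbiased for $p$, the number of successes $u$ is sufficient, and $v(u) = E(t \mid u)$ can be found by exact enumeration. The function name `rao_blackwell_demo` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def rao_blackwell_demo(n, p):
    """Exact mean and variance of t = z_1 and of v(u) = E(t | u), with u = z_1 + ... + z_n.

    Assumes 0 < p < 1 so that every value of u has positive probability.
    """
    p = Fraction(p)
    outcomes = []
    for z in product((0, 1), repeat=n):
        prob = Fraction(1)
        for zi in z:
            prob *= p if zi else 1 - p
        outcomes.append((z, prob))

    # v(u) = E(z_1 ; u) / P(u), accumulated over all outcomes with the same u
    numer, denom = {}, {}
    for z, prob in outcomes:
        u = sum(z)
        numer[u] = numer.get(u, Fraction(0)) + z[0] * prob
        denom[u] = denom.get(u, Fraction(0)) + prob
    v = {u: numer[u] / denom[u] for u in denom}

    def mean_and_variance(estimate):
        mean = sum(prob * estimate(z) for z, prob in outcomes)
        var = sum(prob * (estimate(z) - mean) ** 2 for z, prob in outcomes)
        return mean, var

    return mean_and_variance(lambda z: Fraction(z[0])), mean_and_variance(lambda z: v[sum(z)]), v

if __name__ == "__main__":
    (mean_t, var_t), (mean_v, var_v), v = rao_blackwell_demo(4, Fraction(1, 3))
    print(mean_t, var_t, mean_v, var_v, v)
```

Both estimates have mean $p$; the conditioned estimate comes out as $u/n$, free of $p$, and its variance is smaller by the factor $1/n$ in this example.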

Let $x_1, x_2, \cdots$ be a sequence of chance variables whose joint distribution is determined by an unknown point $\theta$ in a parameter space. A sequential sample (test) [9] is determined by specifying a sequence of mutually exclusive events $S_1, S_2, \cdots$, where $S_i$ depends only on $x_1, \cdots, x_i$ and

$$\sum_{i=1}^{\infty} P(S_i) = 1 \qquad \text{for all } \theta. \tag{4}$$

The event $S_i$ is that sampling stops after the $i$th observation, and (4) ensures that sampling stops eventually. Thus if we define the chance variable $n = i$ when $S_i$ occurs, $n$ is the size of the sample.

³ It was pointed out by the referee that, strictly speaking, $u$ does not have to be sufficient; it is necessary only that $v(u)$ be independent of $\theta$. The author is indebted to the referee for many valuable suggestions.


Denote by $u_1, u_2, \cdots$ any sequence of chance variables such that
$u_i = u_i(x_1, \cdots, x_i)$ is a sufficient statistic for estimating $\theta$ from
$x_1, \cdots, x_i$. There will of course be many such sequences $\{u_i\}$, but it often
happens that there is one which arises in a natural way from the sequential process;
if we are sampling from a binomial population, for instance, $u_i$ = number of
defectives in the first $i$ observations is a sufficient statistic. We shall suppose
that the sequential test satisfies the following condition

(6) $$S_i = W_i\,C(S_1 + \cdots + S_{i-1}),$$ ⁴

where $W_i$ is an event depending on $u_i$ only. This condition means that when
the $i$th observation is taken, the decision to stop at this point depends only on
the $i$th sufficient statistic $u_i$. For the binomial example mentioned above, this
means that the decision to stop after $i$ observations depends only on the number
of defectives observed at that stage, and not on the order in which they were
observed. The Neyman criterion for $u_i$ to be a sufficient statistic [7, 10, p. 135]
shows that (6) is no restriction whatever for the sequential probability ratio
test [9] since the ratio in terms of which the test is defined will be a function of
$u_i$ only.


Let $t_1, t_2, \cdots$ be any sequence of chance variables such that $t_i$ is a function of
$x_1, \cdots, x_i$; define $t = t_i$ when $S_i$ occurs. If $E(t) = \theta$, $t$ is said to be an
unbiased estimate for $\theta$ (relative to the particular sequential test $\{S_i\}$). The
theory of sequential sampling has been formulated primarily for testing hypotheses; a
problem which arises naturally and often is the following: After a sequential
sample has been obtained, is there an unbiased estimate for $\theta$? Since a sample
of constant size is a special case of a sequentially selected sample, we cannot
hope to find unbiased estimates for arbitrary sequential samples unless such
estimates exist for samples of every constant size. This is equivalent to the
existence of a function $t(x_1)$ for which $E(t) = \theta$ for all $\theta$. Our problem is to
discover an unbiased estimate for $\theta$ which, when $n = i$, is a function of $u_i$ alone.
Such an estimate has been found by Girshick, Mosteller, and Savage [4] for
sequential samples from a binomial population. It turns out that whenever
there is any unbiased estimate at all for a particular sequential test, there is
also one of the type described. Thus, if there is an unbiased estimate $t$ for
samples of fixed size $N$, there will be an unbiased estimate of the type described
for every sequential test requiring at least $N$ observations, since $t$ is itself an
unbiased estimate for such sequential tests.

Denote by $t$ any unbiased estimate for $\theta$ relative to a particular sequential
test $\{S_i\}$. Denote by $w_i$, $h_i$ the characteristic functions of the events $W_i$,
$C(S_1 + \cdots + S_i)$ respectively, and define $u = u_i$,
$v = E(h_{i-1} t_i \mid u_i)/E(h_{i-1} \mid u_i)$ when $n = i$. To justify the definition of $v$
we remark that the event $\{n = i,\ E(h_{i-1} \mid u_i) = 0\}$ has probability zero, since
$g h_{i-1} \le h_{i-1}$ with probability one, where $g$ is the characteristic function of
the event $\{E(h_{i-1} \mid u_i) > 0\}$, while


⁴ For any event $A$, $C(A)$ denotes the event that $A$ does not occur.







$$E(g h_{i-1}) = E[g\,E(h_{i-1} \mid u_i)] = E[E(h_{i-1} \mid u_i)] = E(h_{i-1}).$$

Since $u_i$ is a sufficient statistic for $\theta$ with respect to $x_1, \cdots, x_i$, $v$ is a
function of $u$ and $n$ only, independent of $\theta$. The main result of this section is

THEOREM 3. $v$ is an unbiased estimate for $\theta$.

PROOF: We shall show that $v = E(t \mid u, n)$. This not only shows that $v$ is an
unbiased estimate for $\theta$, but also interprets $v$ in a very simple way and, as
mentioned above, implies that the variance of $v$ does not exceed that of $t$. It must
be verified that for every event $D$ depending only on $n$ and $u$, $E(dv) = E(dt)$,
where $d$ is the characteristic function of $D$. Now $D = \sum_{i=1}^{\infty} D S_i$, and
$D S_i = D_i S_i$, where $D_i$ is an event depending only on $u_i$. It is sufficient, then,
to show $E(d_i w_i h_{i-1} v) = E(d_i w_i h_{i-1} t)$, where $d_i$ is the characteristic
function of $D_i$. Now

$$E(d_i w_i h_{i-1} v) = E[d_i w_i h_{i-1}\,E(h_{i-1} t_i \mid u_i)/E(h_{i-1} \mid u_i)],$$

using the definition of $v$. The function in brackets is $h_{i-1}$ multiplied by a function
of $u_i$; by Theorem 1 its expectation is unaltered if $h_{i-1}$ is replaced by
$E(h_{i-1} \mid u_i)$. Thus the right member of the last equality equals

$$E[d_i w_i E(h_{i-1} t_i \mid u_i)] = E(d_i w_i h_{i-1} t_i) = E(d_i w_i h_{i-1} t).$$


We conclude with two examples:

I. BINOMIAL AND POISSON DISTRIBUTIONS. Suppose $x_1, x_2, \cdots$ are independent
with identical distributions, either binomial or Poisson, with parameter
$\theta$. Then $t = x_1$ ($= t_i$ for all $i$) is an unbiased estimate for $\theta$, and it is well
known that $u_i = x_1 + \cdots + x_i$ is a sufficient statistic for estimating $\theta$ from
$x_1, \cdots, x_i$. For any sequential test satisfying (6) our unbiased estimate for
$\theta$ will be

$$v = \frac{E(h_{i-1} t \mid u_i = u)}{E(h_{i-1} \mid u_i = u)} = \frac{E(h_{i-1} t f)}{E(h_{i-1} f)}$$

when $n = i$, $u_i = u$, where $f$ is the characteristic function of the event $u_i = u$.
Then

$$v = \frac{\sum_{j=1}^{\infty} j\,k_j(u, i)}{\sum_{j} k_j(u, i)} \quad \text{for Poisson},
\qquad
v = \frac{k_1(u, i)}{\sum_{j} k_j(u, i)} \quad \text{for binomial},$$

where $k_j(u, i)$ denotes the number of possible sequences $x_1, \cdots, x_i$ for which
$n \ge i$, $x_1 + \cdots + x_i = u$, and $x_1 = j$. For the binomial case, this is the estimate
found in [4].
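As an illustration of the binomial formula, the path counts $k_j(u, i)$ can be accumulated one observation at a time. The sketch below is not taken from the paper: the function names and the particular stopping rule (stop at the second defective or the third non-defective) are assumptions chosen only to give a small, bounded example. Exact rational arithmetic is used so that $E(v) = \theta$ can be checked without rounding.

```python
from fractions import Fraction

def sequential_binomial_estimates(stop, max_n):
    """Path-counting form of v = k_1(u, i) / (k_0(u, i) + k_1(u, i)).

    stop(i, u) is a hypothetical stopping rule depending only on the number
    of observations i and the sufficient statistic u = x_1 + ... + x_i.
    Returns {(i, u): (v, number_of_paths_stopping_at_(i, u))}.
    """
    # total[u]: sequences x_1..x_i with sum u that have not stopped before i
    # first[u]: those among them with x_1 = 1
    total = {0: 1, 1: 1}
    first = {0: 0, 1: 1}
    result = {}
    for i in range(1, max_n + 1):
        next_total, next_first = {}, {}
        for u, k in total.items():
            if stop(i, u):
                result[(i, u)] = (Fraction(first[u], k), k)
                continue
            for x in (0, 1):  # extend every continuing path by one observation
                next_total[u + x] = next_total.get(u + x, 0) + k
                next_first[u + x] = next_first.get(u + x, 0) + first[u]
        total, first = next_total, next_first
    if total:
        raise ValueError("stopping rule does not terminate within max_n")
    return result

def expected_value(estimates, theta):
    """E(v) when each observation is a defective with probability theta."""
    return sum(v * k * theta**u * (1 - theta)**(i - u)
               for (i, u), (v, k) in estimates.items())

# Assumed rule: stop at the 2nd defective or at the 3rd non-defective.
rule = lambda i, u: u >= 2 or i - u >= 3
est = sequential_binomial_estimates(rule, max_n=4)
```

With this rule the estimate at the stopping point $(i, u) = (3, 2)$ is $1/2$, and `expected_value(est, theta)` returns `theta` itself for any rational `theta`, in agreement with Theorem 3.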

II. SAMPLES OF CONSTANT SIZE. We consider the special case where a
sample of constant size $N$ is selected, $x_1, \cdots, x_N$ are independent with identical
distributions, and the density function for $x_j$ has the form

(7) $$p(x, \theta) = r(\theta)\,s(\theta)^{w(x)}\,g(x)$$

considered by Koopman [6].⁵ Suppose further that there is an unbiased estimate
$t(x_1)$ for $\theta$. These conditions will be satisfied, for instance, if $\theta$ is the mean of a
binomial, Poisson, or normal distribution, with $w(x) = t(x) = x$. Then
$u_N = w(x_1) + \cdots + w(x_N)$ is a sufficient statistic. Our estimate $v$ becomes simply
$v = E[t(x_1) \mid u_N]$. Now $E[t(x_1) \mid u_N] = \cdots = E[t(x_N) \mid u_N]$, since $u_N$ is a
symmetric function of $x_1, \cdots, x_N$, which are independent with identical
distributions. Consequently

$$v = E\Big[\sum_{j=1}^{N} t(x_j)/N \;\Big|\; u_N\Big],$$

so that

$$\sigma^2(v) \le \sigma^2\Big(\sum_{j=1}^{N} t(x_j)/N\Big) = \sigma^2 t(x_1)/N.$$

In the special case $w(x) = t(x) = x$, we have $v = \sum_{j=1}^{N} x_j/N$, i.e. our estimate is
simply the mean of the $N$ observations $x_1, \cdots, x_N$.
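The special case $w(x) = t(x) = x$ can be checked by brute force for 0/1 observations. The following is a small sketch under assumed names (nothing here comes from the paper): it enumerates every 0/1 sequence of length $N$, uses the fact that sequences with the same sum are equally likely, and confirms that $E[x_1 \mid u_N = u] = u/N$.

```python
from fractions import Fraction
from itertools import product

def conditional_mean_of_first(N):
    """E[x_1 | u_N = u] for N independent, identically distributed 0/1 observations.

    Given the sum u, all 0/1 sequences with that sum are equally likely,
    so the conditional expectation is a plain average over those sequences.
    """
    sums, counts = {}, {}
    for seq in product((0, 1), repeat=N):
        u = sum(seq)
        sums[u] = sums.get(u, 0) + seq[0]
        counts[u] = counts.get(u, 0) + 1
    return {u: Fraction(sums[u], counts[u]) for u in counts}

def variances(N, theta):
    """Variance of t = x_1 and of v = u_N / N for 0/1 observations with mean theta."""
    var_t = theta * (1 - theta)
    return var_t, var_t / N

v = conditional_mean_of_first(4)
```

Here `v[u]` equals `Fraction(u, 4)` for every `u`, and `variances` shows the reduction in variance by the factor $N$ when $t = x_1$ is replaced by $v$.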


REFERENCES 

[1] J. L. Doob, "The elementary Gaussian processes," Annals of Math. Stat., Vol. 15
(1944), pp. 229-282.

[2] J. L. Doob, "The law of large numbers for continuous stochastic processes," Duke
Math. Jour., Vol. 6 (1940), pp. 290-306.

[3] J. L. Doob, "Stochastic processes with an integral-valued parameter," Trans. Amer.
Math. Soc., Vol. 44 (1938), pp. 87-150.

[4] M. A. Girshick, Frederick Mosteller, and L. J. Savage, "Unbiased estimates for
certain binomial sampling problems, with applications," Annals of Math. Stat.,
Vol. 17 (1946), pp. 13-23.

[5] A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergebnisse der
Mathematik, Vol. 2 (1933).

[6] B. O. Koopman, "On distributions admitting a sufficient statistic," Trans. Amer.
Math. Soc., Vol. 39 (1936), pp. 399-409.

[7] J. Neyman, Giornale dell'Istituto Italiano degli Attuari, Vol. 6 (1934), pp. 320-334.

[8] S. Saks, Theory of the Integral, Stechert, 1937.

[9] A. Wald, "Sequential tests of statistical hypotheses," Annals of Math. Stat., Vol. 16
(1945), pp. 117-186.

[10] S. S. Wilks, Mathematical Statistics, Princeton Univ. Press, 1943.


5 It has been shown by Koopman [6] that if there is a sufficient statistic satisfying cer- 
tain regularity conditions, the density function for x must be of the form (7). 










THE DISTRIBUTION OF THE MEAN 
By E. L. WELKER 
University of Illinois 


1. Summary. Both population and sample mean distributions can be represented
or approximated by Pearson curves if the first four moments of the
population are finite. Using the $\alpha_3^2$, $\delta$ chart of Craig [2] to determine the Pearson
curve type for the population, an analogous $\bar\alpha_3^2$, $\bar\delta$ chart is derived for the
distribution of the mean. This defines a one to one transformation of $\alpha_3^2$, $\delta$ into
$\bar\alpha_3^2$, $\bar\delta$. The properties of this transformation are used to discuss the approach
to normality of the distribution of the mean as dictated by the central limit
theorem. This is facilitated by superposing on the $\alpha_3^2$, $\delta$ chart the $\bar\alpha_3^2$, $\bar\delta$ charts
for samples of 2, 5, and 10.


2. Introduction. For any given distribution function of a population, a
method is available for finding the distribution function of the mean, when it
exists, that depends on characteristic functions and the Fourier integral theorem.
For example, characteristic functions have been used to show that the arithmetic
mean of samples from a normal population is normal, and, with minor restrictions
on non-normal populations, that it is asymptotically normal. The method
depends, of course, on a knowledge of the exact population distribution.
Some authors have discussed the approximation of the distributions of sample
means in special cases by one of the Pearson curves. It is the purpose of this
paper to consider the complete range of Pearson curves as populations to be
sampled, then to give the sampling distributions of the mean as approximated
by the Pearson system, and to discuss the manner in which the distribution of
the mean approaches the normal curve as dictated by the central limit theorem.
Since the choice of a Pearson curve depends only on moment relationships, this
will include the approximation of the distribution of the mean for any parent
population as based on its moments. Both an algebraic and a graphic analysis
will be given.


3. Seminvariant and moment relationships. Denote by $\alpha_k$ the $k$th order
moment of the population with zero mean and unit variance. Let $\lambda_k$ be the $k$th
order seminvariant of the population. Let $\bar\alpha_k$ and $\bar\lambda_k$ be the same parameters
of the distribution of $\bar x$, the mean of a random sample of size $N$ drawn from this
parent population. Using properties of the seminvariants of linear functions of
variables independent in the probability sense, formulas relating these parameters
[1] are

$$\bar\lambda_k = \lambda_k N^{1-k},$$
$$\bar\alpha_3^2 = \alpha_3^2 N^{-1},$$

and

$$\bar\alpha_4 = [\alpha_4 + 3(N - 1)]N^{-1}.$$










4. The Pearson system of curves and the distribution of the mean. The
determination of the Pearson curve will be made in accordance with the scheme
discussed by C. C. Craig [2]. In this system the curve type is fixed by the
moment $\alpha_3^2$ and the constant

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}.$$

Fig. 1. The $\alpha_3^2$, $\delta$ Chart for Pearson's Curves

The scheme for determining the type of curve is shown graphically in Fig. 1 in
which the $\alpha_3^2$, $\delta$ plane is divided into areas in which the Pearson curve types are
noted. The bounding $\alpha_3^2$, $\delta$ curves are

$$\delta = -1, \quad \delta = -\tfrac{1}{2}, \quad \delta = 0, \quad \delta = \tfrac{2}{5}, \quad \alpha_3^2 = 0,$$
$$\alpha_3^2 = 4\delta(\delta + 2), \quad \text{and} \quad (2 + 3\delta)\alpha_3^2 = 4(1 + 2\delta)^2(2 + \delta).$$

Let $\bar\delta$ denote the value of the $\delta$ function for the distribution of the mean. Then

$$\bar\delta = \frac{2\bar\alpha_4 - 3\bar\alpha_3^2 - 6}{\bar\alpha_4 + 3}.$$

In terms of moments of the parent population

$$\bar\delta = \frac{2\left[\dfrac{\alpha_4 + 3(N-1)}{N}\right] - \dfrac{3\alpha_3^2}{N} - 6}{\dfrac{\alpha_4 + 3(N-1)}{N} + 3}
= \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3 + 6(N - 1)}.$$

We see that $\bar\delta = \delta$ for $N = 1$, and $|\bar\delta| < |\delta|$ for $N > 1$. Both $\bar\delta$ and $\bar\alpha_3^2$ approach
zero as $N$ approaches infinity. These are the values of the constants for the
normal function. This result is expected from the central limit theorem.
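The relations above are easy to evaluate numerically. The sketch below uses assumed function names and, as an illustrative parent population not discussed in the paper, the rectangular (uniform) distribution, for which $\alpha_3^2 = 0$ and $\alpha_4 = 1.8$, so that $\delta = -1/2$.

```python
def delta(alpha3_sq, alpha4):
    """Craig's constant: (2*alpha_4 - 3*alpha_3^2 - 6) / (alpha_4 + 3)."""
    return (2 * alpha4 - 3 * alpha3_sq - 6) / (alpha4 + 3)

def mean_constants(alpha3_sq, alpha4, N):
    """(bar alpha_3^2, bar alpha_4, bar delta) for the mean of N observations."""
    bar_alpha3_sq = alpha3_sq / N
    bar_alpha4 = (alpha4 + 3 * (N - 1)) / N
    return bar_alpha3_sq, bar_alpha4, delta(bar_alpha3_sq, bar_alpha4)

# Assumed example: uniform parent population, alpha_3^2 = 0 and alpha_4 = 1.8.
for N in (1, 2, 5, 10, 100):
    a3sq, a4, d = mean_constants(0.0, 1.8, N)
    print(N, round(a4, 4), round(d, 4))
```

The printed $\bar\delta$ starts at $-0.5$ for $N = 1$ and moves toward zero as $N$ grows, while $\bar\alpha_4$ moves toward 3, the value for the normal curve.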







5. The $\bar\alpha_3^2$, $\bar\delta$ diagram for varying sample size. For every given population
with finite moments of orders 1 through 4 there exists a Pearson curve representing
or approximating its distribution. This determines a point in the $\alpha_3^2$, $\delta$
plane. For a given sample size, $N$, there corresponds a point in the $\bar\alpha_3^2$, $\bar\delta$ plane.
If the point ($\bar\alpha_3^2$, $\bar\delta$) is now plotted on the $\alpha_3^2$, $\delta$ plane, we can determine the type
of Pearson curve which is needed to approximate the distribution of $\bar x$. The
transformation of $\alpha_3^2$, $\delta$ into $\bar\alpha_3^2$, $\bar\delta$ enables us to analyze the relationship between
population distributions and distributions of $\bar x$. The transforms of the boundary
curves in the $\alpha_3^2$, $\delta$ plane will constitute an $\bar\alpha_3^2$, $\bar\delta$ chart corresponding to the one
for $\alpha_3^2$, $\delta$ shown in Fig. 1. In studying the approach to normality of the
distribution of $\bar x$, it is illuminating to superimpose this $\bar\alpha_3^2$, $\bar\delta$ chart on the $\alpha_3^2$, $\delta$ chart.
In order to do this, it is necessary to make certain algebraic changes in the
equations.

First eliminate $\alpha_4$ from the formula for $\bar\delta$ as follows. From

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3} \quad \text{we find} \quad \alpha_4 = \frac{3\delta + 3\alpha_3^2 + 6}{2 - \delta}.$$

Substitute this in the expression for $\bar\delta$. Then

$$\bar\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3 + 6(N - 1)} = \frac{\delta(\alpha_3^2 + 4)}{\alpha_3^2 + 4 + 2(N - 1)(2 - \delta)}.$$

This formula, in conjunction with

$$\alpha_3^2 = N\bar\alpha_3^2,$$

enables us to write the transformations of the boundary curves.


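Boundary Curve                         Transformed Curve

$\delta = -1$                          $\bar\delta = -\dfrac{N\bar\alpha_3^2 + 4}{N\bar\alpha_3^2 + 4 + 6(N - 1)}$

$\delta = -\tfrac{1}{2}$               $\bar\delta = -\dfrac{N\bar\alpha_3^2 + 4}{2(N\bar\alpha_3^2 + 4) + 10(N - 1)}$

$\delta = 0$                           $\bar\delta = 0$

$\delta = \tfrac{2}{5}$                $\bar\delta = \dfrac{2(N\bar\alpha_3^2 + 4)}{5(N\bar\alpha_3^2 + 4) + 16(N - 1)}$

$\alpha_3^2 = 4\delta(\delta + 2)$     $\bar\alpha_3^2[N\bar\alpha_3^2 + 4 + 2\bar\delta(N - 1)]^2$
                                       $= 4\bar\delta(\bar\alpha_3^2 + 4)[\bar\delta(N\bar\alpha_3^2 + 8N - 4) + 2N\bar\alpha_3^2 + 8]$

$(2 + 3\delta)\alpha_3^2 = 4(1 + 2\delta)^2(2 + \delta)$
                                       $[\bar\delta(16N + 3N\bar\alpha_3^2 - 4) + 2N\bar\alpha_3^2 + 8]\,[N\bar\alpha_3^2 + 4 + 2\bar\delta(N - 1)]^2 N\bar\alpha_3^2$
                                       $= 4[\bar\delta(2N\bar\alpha_3^2 + 10N - 2) + N\bar\alpha_3^2 + 4]^2[\bar\delta(N\bar\alpha_3^2 + 8N - 4) + 2N\bar\alpha_3^2 + 8]$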





Fig. 2. The $\alpha_3^2$, $\delta$ and $\bar\alpha_3^2$, $\bar\delta$ Charts. $N = 2$


Fig. 2 shows the chart for distributions of $\bar x$ for $N = 2$ by dashed curves
superimposed on the chart for the population shown by the solid curves, and
Fig. 3 consists of the same curves for $N = 5$ and $N = 10$. The intervals on the
population values are $0 \le \alpha_3^2 \le 12$ and $-1 \le \delta \le .4$ in Fig. 2, but only part
of the $\alpha_3^2$ range is shown in Fig. 3. In each case the curves for the distribution
of $\bar x$ cover the interval for $\bar\alpha_3^2$, $\bar\delta$ which corresponds to the entire interval shown
for the population in Fig. 2. Population curves are identified by capital letters
and the corresponding curves for the distribution of $\bar x$ by the corresponding lower
case letters.























Before discussing the Pearson curve relationships disclosed by these graphs,
let us analyze some of the geometric properties of the transformation itself.
Let $N$ be considered as the parameter defining families of curves in the $\bar\alpha_3^2$, $\bar\delta$
plane corresponding to $\alpha_3^2$ = constant and $\delta$ = constant, the systems of lines
parallel to the coordinate axes. The transform of $\alpha_3^2 = k$ is $\bar\alpha_3^2 = k/N$, a system
of lines perpendicular to $\bar\delta = 0$, and approaching $\bar\alpha_3^2 = 0$ with increasing $N$ at
the rate $kN^{-2}$. The line $\alpha_3^2 = 0$ is invariant under the transformation, but it is
not pointwise invariant.

Fig. 3. The $\alpha_3^2$, $\delta$ and $\bar\alpha_3^2$, $\bar\delta$ Charts


The transform of $\delta = C$ is

$$\bar\delta = \frac{C(N\bar\alpha_3^2 + 4)}{N\bar\alpha_3^2 + 4 + 2(N - 1)(2 - C)}.$$

Solving for $\bar\alpha_3^2$, this becomes

$$\bar\alpha_3^2 = \frac{4C - \bar\delta[4 + 2(N - 1)(2 - C)]}{N(\bar\delta - C)},
\quad \text{or} \quad
\bar\delta = \frac{C(\bar\alpha_3^2 + 4/N)}{\bar\alpha_3^2 + [4 + 2(N - 1)(2 - C)]N^{-1}}.$$

Except for the straight line $\bar\delta = 0$, obtained when $C = 0$, this is a system of
rectangular hyperbolas with asymptotes

$$\bar\alpha_3^2 = -[4 + 2(N - 1)(2 - C)]N^{-1} \quad \text{and} \quad \bar\delta = C.$$

We are concerned only with the range $\bar\alpha_3^2 \ge 0$. Hence
$-[4 + 2(N - 1)(2 - C)]N^{-1}$
must be positive for the asymptote to show on the diagram. Since $|\delta| < 2$,
and thus $|C| < 2$, the expression in brackets is necessarily positive. Hence
the vertical asymptote is always outside the range of interest and will not show
on the diagram. However the horizontal asymptotes, $\bar\delta = C$, do appear in all
cases. The hyperbolas are concave downward if $C > 0$ and are concave upward
if $C < 0$.


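Lines of the pencil $\delta = m\alpha_3^2$ are transformed into the hyperbolas

$$\bar\delta = \frac{m\bar\alpha_3^2(N\bar\alpha_3^2 + 4)}{\bar\alpha_3^2 - 2m\bar\alpha_3^2(N - 1) + 4}$$

for $N > 1$. It is clear that (0, 0) is the only invariant point. Every point on
$\delta = m\alpha_3^2$ is transformed into a point closer to the origin, the square of the distance
from the origin changing from

$$(m^2 + 1)\alpha_3^4 \quad \text{to} \quad (\bar m^2 + 1)\alpha_3^4 N^{-2},$$

where $\bar m = \bar\delta/\bar\alpha_3^2$.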
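It is easily verified that the hyperbolas are asymptotic to

$$\bar\delta = \frac{mN\bar\alpha_3^2}{1 - 2m(N - 1)} - \frac{4m(N - 1)(1 + 2m)}{[1 - 2m(N - 1)]^2}
\quad \text{and} \quad
\bar\alpha_3^2 = \frac{-4}{1 - 2m(N - 1)}.$$

As $N$ approaches infinity, these asymptotes approach

$$\bar\delta = -\frac{\bar\alpha_3^2}{2} \quad \text{and} \quad \bar\alpha_3^2 = 0.$$

An area in quadrant one (four) in the $\alpha_3^2$, $\delta$ plane is transformed into an area in
quadrant one (four) in the $\bar\alpha_3^2$, $\bar\delta$ plane. The transformed area is nearer the
origin.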
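A short numerical sketch can confirm the oblique asymptote of the image of a line $\delta = m\alpha_3^2$. The function names and the values $m = 0.1$, $N = 3$ are assumptions chosen for illustration; the argument `bar_a` stands for $\bar\alpha_3^2$ and is taken far outside the range shown on the charts only to exhibit the limiting behaviour.

```python
def line_image(bar_a, m, N):
    """bar delta on the image of the line delta = m * alpha_3^2."""
    return m * bar_a * (N * bar_a + 4) / (bar_a - 2 * m * bar_a * (N - 1) + 4)

def oblique_asymptote(bar_a, m, N):
    """Oblique asymptote of that hyperbola, evaluated at bar_a."""
    c = 1 - 2 * m * (N - 1)
    return m * N * bar_a / c - 4 * m * (N - 1) * (1 + 2 * m) / c**2

def limiting_slope(m, N):
    """Slope of the oblique asymptote; it tends to -1/2 as N grows."""
    return m * N / (1 - 2 * m * (N - 1))

gap = line_image(1e6, 0.1, 3) - oblique_asymptote(1e6, 0.1, 3)
```

The value of `gap` is of order $10^{-5}$, and `limiting_slope(0.1, N)` approaches $-1/2$ for large `N`, in line with the limiting asymptote $\bar\delta = -\bar\alpha_3^2/2$.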


6. Types of Pearson curves for distribution of sample means. Examination
of the graphs in conjunction with the above described properties of the
transformation shows the following facts regarding the distribution of means of
samples drawn from populations identified by $\alpha_3^2$ and $\delta$. First consider the
normal function and the three main Pearson types only.

Parent Population          Distribution of Sample Means
Normal                     Normal
I_B                        I_B
I_J                        I_J and I_B
I_U                        I_U, I_J and I_B
IV                         IV
VI_B                       VI_B and IV
VI_J                       VI_J, VI_B and IV.










The transition types were disregarded completely in the above analysis. It is
worth noting that, disregarding type X, III is transformed into III, VII into
VII, II_U into II_B, never into II_U, V into IV, but never into V. Type X is
transformed into type III, never into X. Others follow a similar pattern.

These moment relationships on the distribution of the mean are not sufficient
conditions in general. In special cases they are, for example the normal
distribution and the type III (see [3]). They do represent the best approximation
curve as specified by the Pearson system. We know that in some cases, for
example type II (see [3]), the distribution of means is not described by a Pearson
curve. It is clear, however, that the approach to normality is indicated
analytically by the transformation $\alpha_3^2$, $\delta$ to $\bar\alpha_3^2$, $\bar\delta$ and is shown graphically by the
$\bar\alpha_3^2$, $\bar\delta$ diagram. Skewness and kurtosis in the parent population are reflected
in the distribution of the mean in small samples. A symmetric distribution
of the mean requires a symmetric parent population regardless of sample size,
but the degree of skewness decreases rapidly with an increasing number in the
sample. The Pearson curve which approximates the distribution of $\bar x$ from a
bell-shaped parent population is also bell-shaped. The Pearson curve
approximating the distribution of $\bar x$ for samples of $N = 10$ (Fig. 3) is bell-shaped for
any parent population with values of $\alpha_3^2$ and $\delta$ within the intervals considered.
For samples of 5 in the same range the approximating curve is either bell-shaped
or J-shaped, but it is never U-shaped. For samples of 2, even the U-shaped
distribution is possible, but only with extreme values of $\alpha_3^2$ and $\delta$. The point in
the $\alpha_3^2$, $\delta$ plane corresponding to the normal curve is the only invariant point in
the transformation. Hence parent populations with parameters not satisfying
$\alpha_3^2 = \delta = 0$ cannot yield normal distributions of sample means.


REFERENCES 

[1] T. N. Thiele, "The theory of observations", Annals of Math. Stat., Vol. 2 (1931), p. 206.

[2] C. C. Craig, "A new exposition and chart for the Pearson system of frequency curves",
Annals of Math. Stat., Vol. 7 (1936), pp. 16-28.

[3] J. O. Irwin, "On the frequency distribution of the means of samples from a population
having any law of frequency with finite moments, with special reference to
Pearson's type II", Biometrika, Vol. 19 (1927), pp. 225-239.








NOTES 


This section is devoted to brief research and expository articles on methodology
and other short items.


ON THE STUDENTIZATION OF SEVERAL VARIANCES 


By B. L. WELCH
University of Leeds, England 


1. Introduction. In a recent paper [1] the author considered the problem
of eliminating several variances simultaneously from probability statements
concerning the mean of a normally distributed variable. The general situation
envisaged was as follows. We supposed that we had an observed quantity $y$
which could be assumed to be normally distributed about a population mean
$\eta$ with variance $\sigma_y^2 = \sum_{i=1}^{k} \lambda_i \sigma_i^2$, where the $\lambda_i$ are known positive numbers
and the $\sigma_i^2$ unknown population variances. It was supposed further that the data
provided estimates $s_i^2$ of the $\sigma_i^2$ based on $f_i$ degrees of freedom, and having the
sampling distributions

(1) $$p(s_i^2)\,ds_i^2 = \frac{1}{\Gamma(\tfrac{1}{2}f_i)} \left(\frac{f_i s_i^2}{2\sigma_i^2}\right)^{\frac{1}{2}f_i - 1} \exp\left\{-\frac{f_i s_i^2}{2\sigma_i^2}\right\} d\left(\frac{f_i s_i^2}{2\sigma_i^2}\right),$$

and that these estimates were distributed independently of each other and of $y$.
The problem was to make statements about the magnitude of the difference
$y - \eta$ which would involve explicitly only the observed variances $s_i^2$. The
probability of the truth of the statements was also to be entirely independent
of the population values $\sigma_i^2$.

The solution was given implicitly in a formal mathematical expression and a 
general process of developing successive terms in a series expansion was de- 
scribed. In the present communication a slightly different way of reaching this 
development is provided. 

2. General method. If the $f_i$ are large enough the ratio

(2) $$\frac{y - \eta}{\sqrt{\sum \lambda_i s_i^2}}$$

can be taken to be normally distributed with mean zero and standard deviation
unity. This suggests that, when the $f_i$ are not necessarily large, we might
approach the matter by seeking some other function

(3) $$x = g\{s_1^2, s_2^2, \cdots, s_k^2, y - \eta\}$$

which will still be normally distributed with the same mean and standard
deviation. We shall see that such a function can be found, although the method
to be followed leads us first to another expression

(4) $$y - \eta = h(s_1^2, s_2^2, \cdots, s_k^2, x)$$

which is simply the transposed form of (3). Once we have obtained $h$ we can
solve out from (4) to obtain $x$.
Since the distribution of $y$ is independent of $s_i^2$ we have

(5) $$p(y)\,dy = \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{(y - \eta)^2}{\sum\lambda_i\sigma_i^2}\right\} dy.$$

Transforming therefore to the new variable $x$ we have for given $s_i^2$

(6) $$p(x \mid s^2)\,dx = \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{h^2(s^2, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h(s^2, x)}{\partial x}\,dx
= j\{s^2, x, \textstyle\sum\lambda_i\sigma_i^2\}\,dx \quad \text{(say)}.$$

The unrestricted distribution of $x$ is then obtained by averaging over the joint
distribution of the $s_i^2$. In order that $x$ should be a unit normal deviate we must
therefore have

(7) $$p(x) = \int \cdots \int j\{s^2, x, \textstyle\sum\lambda_i\sigma_i^2\} \prod \{p(s_i^2)\,ds_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}.$$

We have to substitute from (1) and (6) into (7) and then choose the function
$h(s^2, x)$ in such a manner that the equation is satisfied whatever may be the
values of the unknown $\sigma_i^2$. To evaluate the function by the methods of numerical
integration is probably impracticable except perhaps in some simple special
cases. A series development is, however, quite feasible.

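Symbolically we can write

(8) $$j\{s^2, x, \textstyle\sum\lambda_i\sigma_i^2\} = e^{\sum (s_i^2 - \sigma_i^2)\partial_i}\, j\{w, x, \textstyle\sum\lambda_i\sigma_i^2\}$$

where $\partial_i$ denotes differentiation with respect to $w_i$ and subsequent equation to
$\sigma_i^2$. Equation (7) then integrates out to give

(9) $$\prod \left\{\left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)^{-\frac{1}{2}f_i} e^{-\sigma_i^2\partial_i}\right\} j\{w, x, \textstyle\sum\lambda_i\sigma_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2},$$

i.e.

(10) $$\Theta\, j\{w, x, \textstyle\sum\lambda_i\sigma_i^2\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2} \quad \text{(say)}.$$

The operator $\Theta$ must be expanded in powers of $\partial_i$ before it can be interpreted.
When this is done we find

(11) $$\Theta = \exp\left\{\sum \frac{\sigma_i^4\partial_i^2}{f_i} + \frac{4}{3}\sum \frac{\sigma_i^6\partial_i^3}{f_i^2} + 2\sum \frac{\sigma_i^8\partial_i^4}{f_i^3} + \cdots\right\}$$

(12) $$= 1 + \sum \frac{\sigma_i^4\partial_i^2}{f_i} + \left\{\frac{4}{3}\sum \frac{\sigma_i^6\partial_i^3}{f_i^2} + \frac{1}{2}\left(\sum \frac{\sigma_i^4\partial_i^2}{f_i}\right)^2\right\} + \cdots$$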
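The coefficients in (11) come from expanding the logarithm of a single factor of (9). As a rough numerical check, the operator $\sigma_i^2\partial_i$ can be replaced by an ordinary number $a$; this substitution and the function names below are assumptions made only for the purpose of the check.

```python
import math

def log_factor_exact(a, f):
    """log of (1 - 2a/f)^(-f/2) * exp(-a), with a standing in for sigma_i^2 * d_i."""
    return -0.5 * f * math.log(1 - 2 * a / f) - a

def log_factor_series(a, f):
    """The three terms shown in (11) for a single index i."""
    return a**2 / f + 4 * a**3 / (3 * f**2) + 2 * a**4 / f**3

difference = log_factor_exact(0.1, 10) - log_factor_series(0.1, 10)
```

For $a = 0.1$ and $f = 10$ the two values agree to within about $3 \times 10^{-9}$, which is the size of the first omitted term, $(f/2)(2a/f)^5/5$.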













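Our procedure now is to find successive approximations to $h(s^2, x)$. It will
be convenient to denote by $h_r(s^2, x)$ an expression which equals $h(s^2, x)$ to terms
of order $1/f_i^r$. Further let $c_{r+1}(s^2, x)$ be a corrective term which when added on
to $h_r(s^2, x)$ will give a result correct to terms in $1/f_i^{r+1}$. Then to this order we
shall have from (6)

(13) $$j\{w, x, \textstyle\sum\lambda_i\sigma_i^2\} = \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{h_r^2(w, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h_r(w, x)}{\partial x}
+ \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}x^2\,\frac{\sum\lambda_i w_i}{\sum\lambda_i\sigma_i^2}\right\} \left[\frac{\partial c_{r+1}(w, x)}{\partial x} - \frac{x(\sum\lambda_i w_i)\,c_{r+1}(w, x)}{\sum\lambda_i\sigma_i^2}\right],$$

remembering that the leading term in $h(w, x)$ is $x\sqrt{\sum\lambda_i w_i}$.
Hence from (10) we find

(14) $$\Theta\left[\frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{h_r^2(w, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h_r(w, x)}{\partial x}\right]
+ \frac{1}{\sqrt{2\pi}}\,\frac{\partial}{\partial x}\left\{e^{-\frac{1}{2}x^2}\,\frac{c_{r+1}(\sigma^2, x)}{\sqrt{\sum\lambda_i\sigma_i^2}}\right\} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2},$$

i.e.

(15) $$\frac{\partial}{\partial x}\left\{e^{-\frac{1}{2}x^2}\,\frac{c_{r+1}(\sigma^2, x)}{\sqrt{\sum\lambda_i\sigma_i^2}}\right\}
+ \Theta\left[\frac{1}{\sqrt{\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{h_r^2(w, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h_r(w, x)}{\partial x}\right] - e^{-\frac{1}{2}x^2} = 0.$$

Given $h_r$ we can therefore proceed directly to $c_{r+1}$ and hence to $h_{r+1}$.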
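
3. Application to give terms in $1/f_i$. It will be sufficient illustration of the
method, if we show here how to obtain $h_1$ from $h_0$. We have from (15)

(16) $$\frac{\partial}{\partial x}\left\{e^{-\frac{1}{2}x^2}\,\frac{c_1(\sigma^2, x)}{\sqrt{\sum\lambda_i\sigma_i^2}}\right\}
+ \Theta\left[\exp\left\{-\frac{1}{2}x^2\,\frac{\sum\lambda_i w_i}{\sum\lambda_i\sigma_i^2}\right\} \frac{\sqrt{\sum\lambda_i w_i}}{\sqrt{\sum\lambda_i\sigma_i^2}}\right] - e^{-\frac{1}{2}x^2} = 0,$$

i.e.

(17) $$\frac{\partial}{\partial x}\left\{e^{-\frac{1}{2}x^2}\,\frac{c_1(\sigma^2, x)}{\sqrt{\sum\lambda_i\sigma_i^2}}\right\}
+ \frac{\sum(\lambda_i^2\sigma_i^4/f_i)}{(\sum\lambda_i\sigma_i^2)^2}\; d^2\left[\exp\left\{-\tfrac{1}{2}x^2 u\right\}\sqrt{u}\,\right] = 0,$$

where $d$ now denotes differentiation with respect to $u$ and subsequent equation
to unity, i.e.

(18) $$\frac{\partial}{\partial x}\left\{e^{-\frac{1}{2}x^2}\,\frac{c_1(\sigma^2, x)}{\sqrt{\sum\lambda_i\sigma_i^2}}\right\}
= -\frac{\sum(\lambda_i^2\sigma_i^4/f_i)}{(\sum\lambda_i\sigma_i^2)^2}\;\frac{1}{4}\,(x^4 - 2x^2 - 1)\,e^{-\frac{1}{2}x^2}$$

(19) $$= \frac{1}{4}\,\frac{\sum(\lambda_i^2\sigma_i^4/f_i)}{(\sum\lambda_i\sigma_i^2)^2}\;\frac{d}{dx}\left\{(x + x^3)\,e^{-\frac{1}{2}x^2}\right\},$$

whence

(20) $$c_1(\sigma^2, x) = x\sqrt{\sum\lambda_i\sigma_i^2}\left[\frac{(1 + x^2)}{4}\,\frac{\sum(\lambda_i^2\sigma_i^4/f_i)}{(\sum\lambda_i\sigma_i^2)^2}\right].$$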


Hence to the terms in $1/f_i$ we have

(21) $$y - \eta = h_1(s^2, x) = x\sqrt{\sum\lambda_i s_i^2}\left[1 + \frac{(1 + x^2)}{4}\,\frac{\sum(\lambda_i^2 s_i^4/f_i)}{(\sum\lambda_i s_i^2)^2}\right].$$

Solving this out for $x$ we obtain to the same order

(22) $$x = v\left[1 - \frac{(1 + v^2)}{4}\,\frac{\sum(\lambda_i^2 s_i^4/f_i)}{(\sum\lambda_i s_i^2)^2}\right],$$

where $v$ equals $(y - \eta)/\sqrt{\sum\lambda_i s_i^2}$. To order $1/f_i$ we may regard $x$ as a unit normal
deviate and hence determine the probability level corresponding to the observed
ratio $v$. On the other hand if we wish to determine the value of $y - \eta$ which
will lie on a given percentage level the expression (21) is the appropriate one
to use.
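The two first-order expressions are simple to evaluate. The sketch below is an illustration only: the function names and the numerical inputs (two variance estimates 4.0 and 9.0 with 9 and 14 degrees of freedom, and $\lambda_i = 1/10, 1/15$) are assumptions, not data from the paper. It computes $x$ from an observed ratio $v$, the value of $y - \eta$ for a given unit normal deviate, and the corresponding upper tail probability to order $1/f_i$.

```python
import math
from statistics import NormalDist

def _correction(s2, lam, f):
    """Sum(lam_i^2 s_i^4 / f_i) / (Sum(lam_i s_i^2))^2, the 1/f_i factor."""
    num = sum(l * l * s * s / fi for l, s, fi in zip(lam, s2, f))
    den = sum(l * s for l, s in zip(lam, s2)) ** 2
    return num / den

def h1(x, s2, lam, f):
    """First-order value of y - eta at the unit normal deviate x."""
    scale = math.sqrt(sum(l * s for l, s in zip(lam, s2)))
    return x * scale * (1 + (1 + x * x) / 4 * _correction(s2, lam, f))

def x_from_v(v, s2, lam, f):
    """First-order unit normal deviate for the observed ratio v."""
    return v * (1 - (1 + v * v) / 4 * _correction(s2, lam, f))

def upper_tail(v, s2, lam, f):
    """Approximate probability of a ratio exceeding v, to order 1/f_i."""
    return 1 - NormalDist().cdf(x_from_v(v, s2, lam, f))

# Assumed inputs for illustration.
s2, lam, f = [4.0, 9.0], [1 / 10, 1 / 15], [9, 14]
x = x_from_v(2.0, s2, lam, f)
```

With these inputs an observed ratio $v = 2$ corresponds to $x \approx 1.8913$, so the tail probability is somewhat larger than the one obtained by treating $v$ itself as a unit normal deviate.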

4. Further discussion. The present development is of course basically
equivalent to that given in the previous paper. Indeed if we integrate (10) or
(15) out with respect to $x$ we arrive immediately at the formulae which were then
obtained and which were illustrated by calculating terms to order $1/f_i^2$. In
fact when calculating higher order terms it seems best to do this integration
before carrying out the operation $\Theta$. The object of the present note is really to
stress the fact that we are simply finding a function of the observations and of
$y - \eta$ which is distributed as a unit normal deviate, whatever the values the
true $\sigma_i^2$ may chance to possess.

Finally, the remarks following equation (7) above should be somewhat amplified.
The equation asserts that the distribution of any arbitrary function $x$,
defined by (3), is

(23) $$p(x) = \int \cdots \int \frac{1}{\sqrt{2\pi\sum\lambda_i\sigma_i^2}} \exp\left\{-\frac{1}{2}\,\frac{h^2(s^2, x)}{\sum\lambda_i\sigma_i^2}\right\} \frac{\partial h(s^2, x)}{\partial x} \prod \{p(s_i^2)\,ds_i^2\},$$

where $h(s^2, x)$ is the function obtained by solving out (3) for $y - \eta$. On carrying
out the integrations in (23) we shall in general obtain $p(x)$ as a function of $x$ and
$\sigma_i^2$. Our argument is that if $h$ be chosen properly the $\sigma_i^2$ will disappear from
$p(x)$, and $x$ will appear only in the form of the unit normal probability function.
To find $h(s^2, x)$ by a direct process of numerical integration would appear to
involve in the first instance the choice of a net-work of points for $x$ and $s_i^2$.
Suppose the range of $x$ is covered by $n_x$ points and the range of $s_i^2$ by $n_s$ points.
We may then as an approximation look on our task as that of finding the ($n_x n_s^k$)
values of $h(s^2, x)$ corresponding to this network. Since (23) is to be true for all $x$
and $\sigma_i^2$, we can take in turn $n_s$ values of $\sigma_i^2$, and then (23) can be replaced by
($n_x n_s^k$) simultaneous equations (it would be necessary to use some formula
expressing $\partial h(s^2, x)/\partial x$ in terms of values of $h(s^2, x)$ at discrete values of $x$ or
conceivably this may be avoided if we work with the integrated form). With
a proper choice of the points for $x$, $s_i^2$, and $\sigma_i^2$, we might expect to evaluate the
series $h(s^2, x)$ to any required degree of accuracy, but clearly as a general process
to be used over a whole range of values $f_i$ this approach would be too laborious.













It may indeed be queried whether theoretically, with an indefinitely fine
network of points, we shall be led to a unique function $h(s^2, x)$ with the common
sense properties, which, from general statistical considerations, we know it
should have in order to be acceptable. As with integral equations of a simpler
character, the passage from a discrete network to a continuum may raise problems,
but it is the author's opinion that the infinite ranges of $x$ and $s_i^2$ give us the
freedom which we require in the solution.

The author, however, prefers to approach the problem from the numerical
behavior of the series, of which (15) gives the general terms. Here the practical
issue appears to be to investigate the relation between the magnitude of the last
term retained and the $f_i$. The author hopes in a further paper to give some
results of an investigation of this character and also some tables facilitating the
calculation of $h(s^2, x)$.


REFERENCE 


[1] B. L. Welch, "The generalization of 'Student's' problem when several different
population variances are involved". Biometrika, Vol. 34 (1947), pp. 28-35.




PROBABILITY SCHEMES WITH CONTAGION IN SPACE AND TIME¹
By FELIX CERNUSCHI² AND LOUIS CASTAGNETTO


Harvard University 


1. Summary. In many natural assemblies of elements, the probability of 
an event for a given element depends not only on the intrinsic nature of that 
particular element, but also on the states of some or all of the rest of the elements 
belonging to the same assembly. On the basis of this general idea of "contagion"
some urn schemes are developed in this paper in which one has contagious 
influence in space and time. The most interesting result found is that in general 
the points of convergence of the probability of the assembly are given by some 
of the roots of an equation p = f(p) and that some of these roots, between zero 
and one, represent stable states of the assembly, or points of convergence, and 
others represent unstable ones, or points of divergence. The two neighboring 
roots, (if they are single), of a root representing a point of convergence are un- 
stable values of the probability. Consequently, under certain conditions, the 
limiting probability may be made to have a finite jump by changing the initial 
probability by an arbitrarily small amount. The concrete cases developed in 
this paper can be considerably extended by similar methods by assuming more 
complicated and general assemblies and laws of contagion. 


¹ On the suggestion of the referee, some parts of the original paper were deleted and
some mathematical simplifications were introduced.
² Research Associate at Harvard Astronomical Observatory and Guggenheim Fellow.




2. Introduction. In the known probability schemes of contagion of Eggenberger
and Polya [1], Greenwood and Yule [2], Lüders [3], Neyman [4], Feller [5]
and others [6], as well as in Markoff chains, different ways are considered in
which the previous results in a definite series of trials may influence the
probabilities of the future ones. All of these schemes consider possible influences of
the results of the different trials along the time axis; and consequently might
be called schemes of contagion in one dimension and one direction.

In many natural assemblies of individuals or elements, the probability of an 
event per individual or element depends not only on the intrinsic nature of the 
considered element but also on the states of the rest of the elements belonging 
to the same assembly. 

The purpose of this paper is to develop some simple schemes with urns in 
which there is a contagious influence in space and time and to show some of their 
consequences. The method which we have used to treat certain concrete cases 
could be applied to more complicated assemblies and laws of influence in space 
and time. 


3. Scheme of a closed assembly of urns in two dimensions. Let us consider 
a set of N urns arranged on a closed surface in such a way that each one of them 
is surrounded by m others. Let each urn contain a finite number of black and 
white balls. In this paper the probability associated with an urn will refer to 
the probability of obtaining a white ball if a single ball is drawn at random from 
the urn. We shall assume that the initial probabilities are equal for all of the 
urns and that the following law of influence holds: When, after a collective 
trial, one finds that the ball drawn from a certain arbitrary urn, taken as the 
central one, is white and that the corresponding results of the m surrounding 
urns give $l$ white and $s$ black balls, one multiplies the probability of obtaining a
white ball out of the central urn by the factor $\alpha_{1,1}^{l}\alpha_{1,2}^{s}$; if the ball drawn from the
central urn were black, without changing the given results of the surrounding
urns, one multiplies the considered probability by the factor $\alpha_{2,1}^{l}\alpha_{2,2}^{s}$. Under
the specified conditions, it is easily seen that the probability of obtaining a white
ball from a definite urn at the $(i + 1)$th trial will be, by considering all the possible
alternatives:


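(1)   $p_{i+1} = \sum_{j=0}^{m} \binom{m}{j}\left[p_i^2 (p_i\alpha_{1,1})^{m-j}(q_i\alpha_{1,2})^{j} + p_i q_i (p_i\alpha_{2,1})^{m-j}(q_i\alpha_{2,2})^{j}\right]$
      $= f(p_i) = p_i^2 (p_i\alpha_{1,1} + q_i\alpha_{1,2})^m + p_i q_i (p_i\alpha_{2,1} + q_i\alpha_{2,2})^m,$
where:
      $p_i + q_i = 1.$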
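
As a small numerical check of relation (1), the following sketch compares the sum over the possible results of the m surrounding urns with the closed binomial form. It assumes the reading of (1) in which a white neighbour contributes the factor alpha_{1,1} (or alpha_{2,1}) and a black neighbour the factor alpha_{1,2} (or alpha_{2,2}); the function names `f_sum` and `f_closed` are ours and purely illustrative.

```python
from math import comb

def f_sum(p, m, a11, a12, a21, a22):
    """Relation (1) as an explicit sum over j, the number of black neighbours."""
    q = 1.0 - p
    total = 0.0
    for j in range(m + 1):
        # central urn white (probability p), then p is multiplied by a11^(m-j) * a12^j
        white = p * p * (p * a11) ** (m - j) * (q * a12) ** j
        # central urn black (probability q), then p is multiplied by a21^(m-j) * a22^j
        black = p * q * (p * a21) ** (m - j) * (q * a22) ** j
        total += comb(m, j) * (white + black)
    return total

def f_closed(p, m, a11, a12, a21, a22):
    """The same quantity after summing the two binomial expansions."""
    q = 1.0 - p
    return p * p * (p * a11 + q * a12) ** m + p * q * (p * a21 + q * a22) ** m
```

When all four factors equal one there is no influence, and both forms reduce to f(p) = p, as one would expect.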


Consequently p; either converges to a root of the equation p = f(p) or tends to 
infinity. As a probability greater than one or smaller than zero has no meaning,











124 FELIX CERNUSCHI AND LOUIS CASTAGNETTO 


we have to study the function y = f(p) between zero and one. In (1) we have 
given an implicit form for y = f(p), corresponding to a particular case of influ-
ence; by changing the law of influence we change the function f(p). In general
one can find graphically the roots of equation p = f(p) by plotting y = f(p) and 
y = p and by determining the intersections of these two lines in the range 
0 < p < 1. Later we shall give the values of these roots for some concrete 
examples. From what we have shown it follows that if, for the considered 
assembly of urns and for especially chosen values of the parameters of inter- 
connection and initial probabilities, the probability tends to some equilibrium 
value, this must be a root of the equation p = f(p). As we shall see later, the 
roots in the range 0 < p < 1 may represent stable or unstable states of the 
assembly. 

Let us consider now a general method for finding the explicit form of the 
function f(p) corresponding to laws of influence similar to the one used by Polya. 

Assume that the trial $i$ results in the drawing of $l$ white balls and $s$ black balls
from the m urns surrounding the central one. Then we add $l w_1$ white and
$s b_1$ black balls to the central urn if the result of the central urn was white, and
$l w_2$ white and $s b_2$ black balls if it was black. It is easy to show that under these
conditions the probability in the trial $i + 1$ is related to the probability in
trial $i$ by the following formula:


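(2)   $p_{i+1} = p_i \int_0^1 t^{N_i - 1}\left[\left(W_i + t_1\frac{\partial}{\partial t_1}\right)\left(p_i t_1^{w_1} + q_i t_2^{b_1}\right)^m\right]_{t_1 = t_2 = t} dt$
      $\quad + (1 - p_i) \int_0^1 t^{N_i - 1}\left[\left(W_i + t_1\frac{\partial}{\partial t_1}\right)\left(p_i t_1^{w_2} + q_i t_2^{b_2}\right)^m\right]_{t_1 = t_2 = t} dt$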
where $W_i$ and $N_i$ are the number of white balls and the total number of balls,
respectively, in the central urn before trial 7. Relation (2) permits us to study 
several interesting schemes. It is easy to see that all the possible schemes which 
can be represented by relations of type (2) give only values of the probability 
in the interval zero and one; and consequently we do not need to make the 
restriction in the analysis of the equation p = f(p) that was necessary in the 
previous scheme, represented by equation (1). 
For the case $w_1 = b_1 = c_1$, $w_2 = b_2 = c_2$, we obtain from (2)


(3)   $p_{i+1} = p_i\,\frac{W_i + m c_1 p_i}{N_i + m c_1} + (1 - p_i)\,\frac{W_i + m c_2 p_i}{N_i + m c_2}.$

If $c_1 = c_2$, (3) gives

(4)   $p_{i+1} = p_i.$

If one takes $c_1 = k_1 N_i$ and $c_2 = k_2 N_i$, (3) becomes


Bt AP + 1 — p) TAM Hp.) 


1+ mk, - 1 + mk 


and the equation p = f(p) has, in this case, the roots 0 and 1. 


(5) Disa = Yi 










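When $w_1 = b_2 = k_1 N_i$ and $b_1 = w_2 = k_2 N_i$, one has to replace $t_1(\partial/\partial t_1)$ by
$t_2(\partial/\partial t_2)$ in the second term of (2); then if we take m = 2,

(6)   $p_{i+1} = \frac{p_i^2 q_i}{1 + 2k_2} + \frac{2 p_i q_i (p_i + k_1)}{1 + k_1 + k_2} + \frac{(1 - 3 p_i q_i)(p_i + 2k_1)}{1 + 2k_1} = f(p_i).$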


In particular, if $k_1 = k_2 = k$, one obtains

(7)   $p_{i+1} = \frac{1}{1 + 2k}\left[4k p_i^2 - (4k - 1)p_i + 2k\right] = f(p_i),$

and the solutions of the equation p = f(p) are p = 1/2 and 1. By considering
the behavior of y = f(p) one finds that the stable solution is given by the root 1/2;
consequently if one starts with any value of 0 < p < 1 the probability tends
to the limiting value 1/2. If $k_1 = 0$, $k_2 \neq 0$, by simple calculations, one obtains
from (6) that the solutions of p = f(p), in this case, are zero and one.
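
The convergence to 1/2 can be illustrated by simply iterating the recursion $p_{i+1} = [4k p_i^2 - (4k - 1)p_i + 2k]/(1 + 2k)$. The sketch below is only an illustration; the value k = 0.5, the number of steps and the function names are arbitrary choices of ours.

```python
def f7(p, k):
    """One step of recursion (7): p -> [4k p^2 - (4k - 1) p + 2k] / (1 + 2k)."""
    return (4 * k * p * p - (4 * k - 1) * p + 2 * k) / (1 + 2 * k)

def iterate(p, k, steps=200):
    """Apply recursion (7) repeatedly, starting from the initial probability p."""
    for _ in range(steps):
        p = f7(p, k)
    return p

# Starting points on either side of 1/2 are both carried towards 1/2.
for start in (0.05, 0.95):
    print(start, iterate(start, 0.5))
```

The slope of f at p = 1/2 is 1/(1 + 2k), which is smaller than one for k > 0, while the slope at p = 1 is (1 + 4k)/(1 + 2k), which is larger than one; this is why the root 1/2 attracts and the root 1 repels.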

The equation p = f(p), as given by (6), always has the solution 1. In order 
to have the other two roots real, one has to satisfy: 


(8)   $k_1 (1 + 2k_2)(2 + k_1 + 3k_2)^2 > 4(1 + k_1 + k_2)\left[(k_1 + k_2)^2 + 2(k_1 - k_2) - 4k_2^2\right].$


A simple and interesting application of relation (2) is for the case of two urns, 
characterized by m = 1. From (2) we obtain: 


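(9)   $p_{i+1} = p_i^2\left(\frac{p_i + k_1}{1 + k_1} + \frac{q_i}{1 + k_2}\right) + p_i q_i\left(\frac{p_i + k_3}{1 + k_3} + \frac{q_i}{1 + k_4}\right) = f(p_i)$

where
$w_1 = k_1 N_i$, $b_1 = k_2 N_i$, $w_2 = k_3 N_i$, $b_2 = k_4 N_i$.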


The equation p = f(p), as given by (9) has the roots 0 and 1; and one may fix 
the value of the third root by conveniently choosing the values of the parameters. 

Applying (2) for an arbitrary value of m and integrating by parts, it is seen 
that in general the equation p = f(p) is of degree m + 2 and consequently, by 
choosing appropriate values for the parameters $k_1$, $k_2$, $k_3$, $k_4$, each of which may
be between -1 and $\infty$, one can expect several roots in the range 0 < p < 1.
One can easily generalize our relation (2) for cases in which $w_1$, $w_2$, $b_1$, $b_2$ are
given functions of the probability $p_i$. Even in this most general case it is simple
to see that one would have a recursion formula of the type $p_{i+1} = f(p_i)$ and, as
in the elementary cases which we have considered, the points of equilibrium of 
the closed assembly of urns will be given by those solutions, in the range 0 < p 
< 1, of the equation p = f(p), where the derivative of y = f(p) is negative. 
Consequently the two neighboring roots, if they are single, of a root representing a 
point of convergence are unstable values of the probability. Therefore, under 
certain conditions, the limiting probability may take a finite jump if the initial 
probability is changed by an arbitrarily small amount. This is, we think, the 
most important consequence of the contagion schemes that we propose. We 













consider that many actual cases of contagion could be better understood by 
schemes of the type that we are studying. 
Let us consider now some simple cases of relation (1). If we take 


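$\alpha_{1,1} = \alpha_{2,2} = \alpha_1$, $\alpha_{1,2} = \alpha_{2,1} = \alpha_2$ and $m = 2$,
representing a closed ring of urns, one obtains:

(10)   $p_{i+1} = p_i^2(\alpha_1 p_i + \alpha_2 q_i)^2 + p_i q_i(\alpha_2 p_i + \alpha_1 q_i)^2$
       $= \alpha_1^2 p_i + (p_i^2 - p_i^3)\left[(\alpha_1 + \alpha_2)^2 - 4\alpha_1^2\right] = f(p_i).$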


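The equation p = f(p), corresponding to this recursion formula, always has the
solution p = 0. The other two solutions are given by

(11)   $P_{1,2} = \frac{1}{2}\left[1 \pm \sqrt{1 - \frac{4(1 - \alpha_1^2)}{(\alpha_1 + \alpha_2)^2 - 4\alpha_1^2}}\,\right].$

These roots will be between 0 and 1 when

(12)   $2 < \alpha_1 + \alpha_2,\quad 1 > \alpha_1 < \alpha_2$     or     (12')   $2 > \alpha_1 + \alpha_2,\quad 1 < \alpha_1 > \alpha_2.$

We would have $P_1 > 0$ and $P_2 < 0$ if

(13)   $2 < \alpha_1 + \alpha_2,\quad 1 < \alpha_1 < \alpha_2$     or     (13')   $2 > \alpha_1 + \alpha_2,\quad 1 > \alpha_1 > \alpha_2,$

and $P_1 = P_2$ when

(14)   $\alpha_1 + \alpha_2 = 2;\quad \alpha_1 \neq 1.$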


Let us now study the general behavior of (10). For the conditions (12’) we 
have: 


(15)   $p_{i+1} - p_i = a\,p_i (p_i - P_1)(p_i - P_2)$
where $a = 4\alpha_1^2 - (\alpha_1 + \alpha_2)^2 > 0$.
If $0 < P_1 < P_2$, one obtains from (15) by use of elementary algebra:


(16) Po | = ap; |(P2— pi)i S si $1. 
Consequently if $p_1 > P_2$ the sequence $p_i$ increases monotonically. Otherwise
$p_{i+1}$ will lie between $P_1$ and $p_i$ and will tend to $P_1$ without ever reaching the other
side of this point. In a similar way it is possible to prove the convergence to a 
constant for the most general equations of the type p = f(p) when they have 
roots between zero and one. 

Let us give some numerical results. For $\alpha_1 = 0.95$ and $\alpha_2 = 1.1$, from (10)
one obtains: $P_1 = 0.1$ and $P_2 = 0.9$. It is easily seen that, in this case, if


$0 < p_1 < 0.1$, the limiting value of $p_i$ will be zero; if $p_1 > 0.1$, the limiting value
will be 0.9. The interesting point is that if the initial probability is in the 
neighborhood of 0.1, an infinitesimal change in its value may produce a finite 
change in the stable limiting probabilities; and that for the initial probability 
equal 0.1 one would have an unstable equilibrium of the system. This con- 
sideration shows why it is important to know how the probability p; converges 
towards a certain point. As we have previously shown, the points of con-
vergence are roots of the equation p = f(p), but there are roots which are not points of
convergence. 
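
The finite jump of the limiting probability can be seen by iterating the ring recursion on either side of the unstable root. The sketch below assumes the reading $p_{i+1} = p_i^2(\alpha_1 p_i + \alpha_2 q_i)^2 + p_i q_i(\alpha_2 p_i + \alpha_1 q_i)^2$ of (10) and computes the two interior roots from the resulting quadratic $p^2 - p + (1 - \alpha_1^2)/[(\alpha_1 + \alpha_2)^2 - 4\alpha_1^2] = 0$ rather than taking them from the text; with the parameter values as printed, the roots obtained under this reading need not coincide with the rounded figures quoted above. All function names are ours.

```python
from math import sqrt

def f10(p, a1, a2):
    """One step of the ring recursion (m = 2), under the reading stated in the lead-in."""
    q = 1.0 - p
    return p * p * (a1 * p + a2 * q) ** 2 + p * q * (a2 * p + a1 * q) ** 2

def interior_roots(a1, a2):
    """Non-zero roots of p = f10(p), i.e. of p^2 - p + (1 - a1^2)/K = 0."""
    K = (a1 + a2) ** 2 - 4.0 * a1 * a1
    r = sqrt(1.0 - 4.0 * (1.0 - a1 * a1) / K)
    return 0.5 * (1.0 - r), 0.5 * (1.0 + r)

def limit(p, a1, a2, steps=5000):
    """Iterate the recursion long enough to approximate the limiting probability."""
    for _ in range(steps):
        p = f10(p, a1, a2)
    return p

low, high = interior_roots(0.95, 1.1)
# Two starting values differing by 2e-6 end at different limits.
print(limit(low - 1e-6, 0.95, 1.1), limit(low + 1e-6, 0.95, 1.1))
```

The lower root acts as a threshold: initial probabilities just below it are carried to zero, and those just above it are carried to the upper root.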

Similar reasoning could be applied to more complicated systems belonging to 
our general scheme of contagion. Consequently, the most important result is 
not that the considered assembly may have a probability tending to some value 
in the range 0 < p <1, but that under certain conditions the limiting probability 
may jump from one value to another by changing the initial probability by an 
arbitrarily small amount. 


REFERENCES 


[1] F. EGGENBERGER AND G. POLYA, Zeits. für Ang. Math. und Mech., Vol. 3 (1923), p. 279.
[2] M. GREENWOOD AND G. U. YULE, Roy. Stat. Soc. Jour., Vol. 83 (1920), p. 255.
[3] R. LÜDERS, Biometrika, Vol. 26 (1934), p. 108.
[4] J. NEYMAN, Annals of Math. Stat., Vol. 10 (1939), p. 35.
[5] W. FELLER, Annals of Math. Stat., Vol. 14 (1943), p. 389.
[6] F. CERNUSCHI AND E. SATES, Anales Soc. Cientifica Argentina, Vol. 138 (1944), p. 201.




FITTING CURVES WITH ZERO OR INFINITE END POINTS 


By EDMUND PINNEY
Oregon State College 


The problem of determining a suitable equation to fit an empirically deter-
mined curve over a given interval has been of great importance in statistical
work, in experimental science, and in engineering technology. Since infinitely
many types of equations may be made to fit the data with required accuracy,
the choice of a "suitable" type of equation depends on the qualitative nature
of the empirical curve, on the use to which the equation is to be put, and upon 
considerations of simplicity. 

As a function type, the polynomial has, because of its simplicity, been enor- 
mously useful. The function type studied here is a little more general than the 
polynomial type, being particularly useful in the case of empirical curves that 
become zero or infinity at one or both ends of the interval. 

Without loss of generality the interval in which the equation is to fit the curve 
may be taken as 0 < x < 1. It is assumed that, by numerical means or other-
wise, a finite set of moments $\mu_m = \int_0^1 y x^m\,dx$ may be computed, y being the
ordinate of the empirical curve.
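
One possible "numerical means" of obtaining the moments, given tabulated ordinates, is the trapezoidal rule. The sketch below is only an illustration of that choice (any quadrature rule would serve); the function name `moments` and the test curve y = x(1 - x) are ours.

```python
def moments(xs, ys, highest):
    """Approximate mu_m = integral over [0, 1] of y * x**m, for m = 0..highest,
    by the trapezoidal rule on the tabulated points (xs[i], ys[i])."""
    result = []
    for m in range(highest + 1):
        g = [y * x ** m for x, y in zip(xs, ys)]
        area = sum(0.5 * (g[i] + g[i + 1]) * (xs[i + 1] - xs[i])
                   for i in range(len(xs) - 1))
        result.append(area)
    return result

# Illustrative data: y = x(1 - x) tabulated at 1001 equally spaced abscissae.
xs = [i / 1000 for i in range(1001)]
ys = [x * (1 - x) for x in xs]
mu = moments(xs, ys, 4)
```

For this test curve the exact values are $\mu_0 = 1/6$ and $\mu_1 = 1/12$, which the rule reproduces to within its discretization error.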











128 EDMUND PINNEY 


The problem to be considered here is that of determining a function f(x) of
the form

(1)   $f(x) = x^{\alpha}(1 - x)^{\beta}\sum_{k} a_k x^k, \qquad R(\alpha) > -1,\ R(\beta) > -1,$

such that

(2)   $\int_0^1 f(x)\,x^m\,dx = \mu_m$

as m ranges from zero to the number of the highest moment known. f(x) is
then an approximation to y which may be written

(3)   $y \approx f(x).$


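THEOREM 1°. Given a finite set of moments $\mu_0, \mu_1, \mu_2, \cdots, \mu_n$, and given that
$R(\alpha) > -1$, $R(\beta) > -1$, define

(4)   $S_p(\alpha, \beta) = \frac{\Gamma(p + \alpha + 1)}{\Gamma(p + \alpha + \beta + 1)} \sum_{m=0}^{p} \binom{p}{m} \frac{\Gamma(p + m + \alpha + \beta + 1)}{\Gamma(m + \alpha + 1)}\,(-)^m \mu_m,$

(5)   $a_k^{(n)} = \frac{(-)^k}{k!\,\Gamma(k + \alpha + 1)} \sum_{p=k}^{n} \frac{(2p + \alpha + \beta + 1)\,\Gamma(p + k + \alpha + \beta + 1)}{(p - k)!\,\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta),$

(6)   $f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} a_k^{(n)} x^k.$

Then f(x) will satisfy (2) for $m = 0, 1, \cdots, n$.

2°. If, in addition to 1°, $\mu_{n+1}$ is known and $\alpha$ and $\beta$ satisfy

(7)   $S_{n+1}(\alpha, \beta) = 0,$

then f(x) will satisfy (2) for m = n + 1 also.

3°. If, in addition to 1° and 2°, $\mu_{n+2}$ is also known, and if $\alpha$, $\beta$ also satisfy

(8)   $S_{n+2}(\alpha, \beta) = 0,$

then f(x) will satisfy (2) for m = n + 2 as well.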
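PROOF. Let $P_m^{(\alpha,\beta)}(z)$ be the Jacobi polynomial of order m defined in terms
of the hypergeometric function by

(9)   $P_m^{(\alpha,\beta)}(z) = \binom{m + \alpha}{m} F\left(-m,\ m + \alpha + \beta + 1;\ \alpha + 1;\ \tfrac{1}{2} - \tfrac{1}{2}z\right).$

Let $P_m^{(\alpha,\beta)}(1 - 2\mu)$ symbolically represent the expression gotten by substituting
$\mu_k$ for $x^k$ in the expansion of the polynomial $P_m^{(\alpha,\beta)}(1 - 2x)$. There exist numbers
$A_{m,q}$ such that

(10)   $x^m = \sum_{q=0}^{m} A_{m,q}\,P_q^{(\alpha,\beta)}(1 - 2x).$

Also

(11)   $\mu_m = \sum_{q=0}^{m} A_{m,q}\,P_q^{(\alpha,\beta)}(1 - 2\mu).$

For $R(\alpha) > -1$, $R(\beta) > -1$, define

(12)   $f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{p=0}^{n} \frac{(2p + \alpha + \beta + 1)\,p!\,\Gamma(p + \alpha + \beta + 1)}{\Gamma(p + \alpha + 1)\,\Gamma(p + \beta + 1)}\,P_p^{(\alpha,\beta)}(1 - 2\mu)\,P_p^{(\alpha,\beta)}(1 - 2x).$

Then by (10), for $m = 0, 1, \cdots, n$,

$\int_0^1 f(x)\,x^m\,dx = \sum_{p=0}^{n} \frac{(2p + \alpha + \beta + 1)\,p!\,\Gamma(p + \alpha + \beta + 1)}{\Gamma(p + \alpha + 1)\,\Gamma(p + \beta + 1)}\,P_p^{(\alpha,\beta)}(1 - 2\mu)$
$\qquad \times \sum_{q=0}^{m} A_{m,q} \int_0^1 x^{\alpha}(1 - x)^{\beta}\,P_p^{(\alpha,\beta)}(1 - 2x)\,P_q^{(\alpha,\beta)}(1 - 2x)\,dx.$

By the orthogonality of the Jacobi polynomials, [1; §4.3],

$\int_0^1 f(x)\,x^m\,dx = \sum_{p=0}^{m} A_{m,p}\,P_p^{(\alpha,\beta)}(1 - 2\mu).$

By (11),

$\int_0^1 f(x)\,x^m\,dx = \mu_m, \qquad (m = 0, 1, \cdots, n).$

It follows from (2) that f(x) as defined in (12) is the f(x) of (1). It remains to be
shown that (12) may be expressed in the form (4)-(6).
From (9),

(13)   $P_p^{(\alpha,\beta)}(1 - 2x) = \frac{\Gamma(p + \alpha + 1)}{\Gamma(p + \alpha + \beta + 1)} \sum_{m=0}^{p} \frac{(-)^m\,\Gamma(p + m + \alpha + \beta + 1)}{m!\,(p - m)!\,\Gamma(m + \alpha + 1)}\,x^m,$

so by (4),

(14)   $P_p^{(\alpha,\beta)}(1 - 2\mu) = \frac{1}{p!}\,S_p(\alpha, \beta).$

Inserting (13) and (14) into (12),

$f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{p=0}^{n} \frac{2p + \alpha + \beta + 1}{\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta) \sum_{k=0}^{p} \frac{(-)^k\,\Gamma(p + k + \alpha + \beta + 1)}{k!\,(p - k)!\,\Gamma(k + \alpha + 1)}\,x^k$

$\quad = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} \frac{(-)^k x^k}{k!\,\Gamma(k + \alpha + 1)} \sum_{p=k}^{n} \frac{(2p + \alpha + \beta + 1)\,\Gamma(p + k + \alpha + \beta + 1)}{(p - k)!\,\Gamma(p + \beta + 1)}\,S_p(\alpha, \beta)$

$\quad = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n} a_k^{(n)} x^k,$

by (5), so the f(x) of (12) may be expressed in the form (4)-(6), and part 1° of
the theorem is established.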

If (7) holds, by (5), $a_k^{(n+1)} = a_k^{(n)}$ for $k = 0, 1, \cdots, n$, and $a_{n+1}^{(n+1)} = 0$. There-
fore, in (6),

$f(x) = x^{\alpha}(1 - x)^{\beta} \sum_{k=0}^{n+1} a_k^{(n+1)} x^k,$

and by part 1°, for the case in which n is replaced by n + 1, it follows that (2)
holds for m = n + 1, so part 2° is established. The establishment of part 3°
is essentially the same.
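
Part 1° of the theorem can be checked numerically for real exponents. The sketch below assumes the following reading of the defining relations: $S_p(\alpha,\beta) = \frac{\Gamma(p+\alpha+1)}{\Gamma(p+\alpha+\beta+1)}\sum_{m=0}^{p}\binom{p}{m}\frac{\Gamma(p+m+\alpha+\beta+1)}{\Gamma(m+\alpha+1)}(-)^m\mu_m$ and $a_k^{(n)} = \frac{(-)^k}{k!\,\Gamma(k+\alpha+1)}\sum_{p=k}^{n}\frac{(2p+\alpha+\beta+1)\Gamma(p+k+\alpha+\beta+1)}{(p-k)!\,\Gamma(p+\beta+1)}S_p(\alpha,\beta)$. The function names and the test curve are ours; the moments of the test curve are generated from beta integrals so that the expected coefficients are known in advance.

```python
from math import gamma, comb, factorial

def S(p, alpha, beta, mu):
    """S_p(alpha, beta) in the reading stated above (real alpha, beta > -1).
    Assumes no gamma argument is a non-positive integer (e.g. alpha + beta != -1)."""
    pref = gamma(p + alpha + 1) / gamma(p + alpha + beta + 1)
    return pref * sum(
        comb(p, m) * gamma(p + m + alpha + beta + 1) / gamma(m + alpha + 1)
        * (-1) ** m * mu[m]
        for m in range(p + 1)
    )

def coefficients(n, alpha, beta, mu):
    """Coefficients a_0, ..., a_n of f(x) = x^alpha (1-x)^beta * sum a_k x^k."""
    s = [S(p, alpha, beta, mu) for p in range(n + 1)]
    a = []
    for k in range(n + 1):
        inner = sum(
            (2 * p + alpha + beta + 1) * gamma(p + k + alpha + beta + 1)
            / (factorial(p - k) * gamma(p + beta + 1)) * s[p]
            for p in range(k, n + 1)
        )
        a.append((-1) ** k / (factorial(k) * gamma(k + alpha + 1)) * inner)
    return a

def beta_integral(u, v):
    """Integral over [0, 1] of x^(u-1) (1-x)^(v-1)."""
    return gamma(u) * gamma(v) / gamma(u + v)

# Test curve y = x^0.5 (1-x)^1.5 (1 + 2x - 0.5 x^2); its moments are exact beta integrals.
alpha, beta, true_a = 0.5, 1.5, [1.0, 2.0, -0.5]
mu = [sum(c * beta_integral(alpha + m + k + 1, beta + 1) for k, c in enumerate(true_a))
      for m in range(3)]
fitted = coefficients(2, alpha, beta, mu)
```

Since the test curve is itself of the form (1) with n = 2, matching its first three moments should return its own coefficients, up to rounding.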

In applying this theorem to the problem of empirical curve fitting, it follows
from (6) that the constants $\alpha$ and $\beta$ should differ from zero only if the empirical
curve approaches zero or infinity at one or both of its endpoints. With this
in mind the following rules may be stated:

Case A. If, in the empirical curve, f(0) ≠ 0 or ∞, and f(1) ≠ 0 or ∞, set
$\alpha = \beta = 0$, and let n be one less than the number of moments that it is desired
to fit.

Case B. If f(0) = 0 or ∞ and f(1) ≠ 0 or ∞, set $\beta = 0$ and determine $\alpha$ from
(7), n being two less than the number of moments that it is desired to fit.

Case C. If f(0) ≠ 0 or ∞ and f(1) = 0 or ∞, set $\alpha = 0$ and determine $\beta$
from (7), n being two less than the number of moments that it is desired to fit.

Case D. If f(0) = 0 or ∞ and f(1) = 0 or ∞, determine both $\alpha$ and $\beta$ from
the two equations (7) and (8), n being three less than the number of moments
that it is desired to fit.

It may happen that these processes cannot be carried out, or at least cannot be
conveniently carried out. If this is the case, $\alpha$ or $\beta$ may be set arbitrarily and n
taken as one unit higher than before, or both $\alpha$ and $\beta$ may be set, and n taken
as two units higher than before.

In Case D, above, the solution of equations (7) and (8) may often prove 
difficult, making it advisable to follow the suggestions of the last paragraph. 
In certain special cases, however, their solution is not difficult. 

Suppose, for example, the moments satisfied the equations

(15)   $\mu_m = \sum_{k=0}^{m} \binom{m}{k} (-)^k \mu_k, \qquad m = 0, 1, 2, \cdots.$


If this is substituted into (4), and the order of summation reversed, on making 
use of the identity 


~ n\Tie+e), wp. ;- T(a)(a — »v + 1) 
(16) &(") Pets rn ’ fa~s-0t Gee’ 





one obtains 


(17)   $S_p(\alpha, \beta) = (-)^p S_p(\beta, \alpha).$

Therefore

(18)   $S_{2p+1}(\alpha, \alpha) = 0.$

When n is an integer, either n + 1 or n + 2 is odd. Therefore when (15)
holds, one of either (7) or (8) will be satisfied identically if we take $\beta = \alpha$. The
other may then be solved for $\alpha$.

As an example, suppose one had the moments $\mu_0 = 1$, $\mu_1 = 1/2$, $\mu_2 = 7/24$, $\mu_3 = 3/16$,
$\mu_4 = 31/240$, and wished to obtain an f(x) such that f(0) = 0, f(1) = 0. In this
case n = 2, and (15) is satisfied. It follows that (7) is satisfied identically when
$\beta = \alpha$, and (8) gives

$\mu_0\frac{\Gamma(2\alpha + 5)}{\Gamma(\alpha + 1)} - 4\mu_1\frac{\Gamma(2\alpha + 6)}{\Gamma(\alpha + 2)} + 6\mu_2\frac{\Gamma(2\alpha + 7)}{\Gamma(\alpha + 3)} - 4\mu_3\frac{\Gamma(2\alpha + 8)}{\Gamma(\alpha + 4)} + \mu_4\frac{\Gamma(2\alpha + 9)}{\Gamma(\alpha + 5)} = 0.$

This easily reduces to

$1 - 4\,\frac{\alpha + 5/2}{\alpha + 1} + 7\,\frac{(\alpha + 5/2)(\alpha + 3)}{(\alpha + 1)(\alpha + 2)} - 6\,\frac{(\alpha + 5/2)(\alpha + 7/2)}{(\alpha + 1)(\alpha + 2)} + \frac{31}{15}\,\frac{(\alpha + 5/2)(\alpha + 7/2)}{(\alpha + 1)(\alpha + 2)} = 0,$

which reduces to the quadratic

$4\alpha^2 - 6\alpha + 5 = 0,$

from which

(19)   $\alpha = \beta = 3/4 \pm (1/4)\sqrt{11}\,i.$

These may be substituted into (4)-(6) to complete the solution.
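
The arithmetic of this example can be verified directly. The sketch below assumes the moment values 1, 1/2, 7/24, 3/16, 31/240 (our reading of the printed example) and the binomial form of $S_4(\alpha, \alpha)$; after multiplication by $\Gamma(\alpha+5)/\Gamma(2\alpha+5)$ the condition $S_4(\alpha,\alpha) = 0$ becomes a polynomial in $\alpha$, which is evaluated at the complex roots of $4\alpha^2 - 6\alpha + 5 = 0$. Function names are ours.

```python
from fractions import Fraction
from math import comb, sqrt

# Assumed moments of the example (our reading of the printed values).
MU = [Fraction(1), Fraction(1, 2), Fraction(7, 24), Fraction(3, 16), Fraction(31, 240)]

def symmetric(mu):
    """Check (15): mu_m equals the alternating binomial sum of mu_0..mu_m, exactly."""
    return all(mu[m] == sum(comb(m, k) * (-1) ** k * mu[k] for k in range(m + 1))
               for m in range(len(mu)))

def s4_scaled(alpha):
    """S_4(alpha, alpha) multiplied by Gamma(alpha+5)/Gamma(2 alpha+5), up to the
    non-vanishing prefactor of S_4; a polynomial of degree 4 in alpha."""
    total = 0
    for m in range(5):
        rising = 1
        for j in range(m):            # Gamma(2a+5+m) / Gamma(2a+5)
            rising *= 2 * alpha + 5 + j
        rest = 1
        for j in range(m, 4):         # Gamma(a+5) / Gamma(a+1+m)
            rest *= alpha + 1 + j
        total += comb(4, m) * (-1) ** m * float(MU[m]) * rising * rest
    return total

root_plus = complex(0.75, sqrt(11) / 4)
root_minus = complex(0.75, -sqrt(11) / 4)
```

Both roots have real part 3/4, so the requirement $R(\alpha) > -1$ of the theorem is met.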
REFERENCE 
[1] G. SZEGŐ, Orthogonal Polynomials, Amer. Math. Soc. Colloquium Pub., No. 23, 1939.
CONSISTENCY OF SEQUENTIAL BINOMIAL ESTIMATES 


By J. WOLFOWITZ
Columbia University 


The notion of consistency of an estimate, introduced by R. A. Fisher, applies 
to a sequence of estimates which converge stochastically, with boundlessly 
increasing sample size, to the parameter (or parameters) being estimated. Each 
estimate is a function of a sample of observations, the number in each sample 
being determined independently of the observations themselves. In sequential 
estimation, on the other hand, the number of observations is itself a chance 











132 J. WOLFOWITZ 


variable, determined by the sequence of observations and the application to 
them of a rule which may be part of a sequential test. In what follows we 
shall consider that the operation of sequential estimation is associated with a
sequential test.¹

The advantage of using consistent estimates is such as to suggest extension 
of the idea of consistency to sequential estimation. In the present paper we 
shall be concerned only with the estimation of a binomial probability (p, say). 
The obvious extension is that a sequence of estimates, each with its associated 
test, is consistent if the estimates converge stochastically to p. 

Since the number of observations required by a sequential test is a chance 
variable, a parallel to the classical sequence of samples of increasing size would 
be a sequence of sequential tests whose average (in some sense) sample sizes 
increase without limit. It seems reasonable to associate only such a sequence 
of estimates with this sequence of tests as will converge stochastically to p, 
i.e., be consistent. 

Let z be a chance variable which takes the distinct values $c_1$ and $c_2$ with proba-
bilities p, 0 < p < 1, and q = 1 - p, respectively. Let $z_1, \cdots, z_n$ be a sequence
of independent observations on z which terminates with the nth according to the
specific sequential test under consideration. Denote by x and y, respectively,
the number of observations $c_2$ and $c_1$ in this sequence. Then x, y and n = x + y
are all chance variables. The couple g = (x, y) is called a boundary point of
index n (see [1]). The sequence of observations which terminates at g is called a
path. Let k(g) denote the number of paths which terminate at g, and let k*(g)
denote the number of these paths whose first observation is $c_1$. The "points"
on the various paths together with all the points g constitute the "region" under
discussion.

Let P{n = j} denote the probability of the relation in braces. If

$\sum_{j=1}^{\infty} P\{n = j\} = 1,$

the region is called closed. Only closed regions will be considered below, so that
this assumption will henceforth be made without explicit formulation. It has
been shown by Girshick, Mosteller, and Savage [1], that p(g) = k*(g)/k(g)
is an unbiased estimate of p for any closed region R, i.e.,

$\sum_{g} p(g)\,k(g)\,p^{y} q^{x} = p,$

where the summation takes place over all the boundary points g of R. For
many important regions this estimate is the unique unbiased estimate.
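
The unbiasedness relation can be verified by direct enumeration for a small closed region. The sketch below uses, purely as an illustration of ours, the rule "stop as soon as a observations $c_1$ or b observations $c_2$ have occurred"; it counts k(g) and k*(g) for every boundary point g = (x, y) and evaluates the sum of $p(g)\,k(g)\,p^y q^x$ in exact rational arithmetic.

```python
from fractions import Fraction

def boundary_counts(a, b):
    """Enumerate every path of the illustrative rule 'stop when y == a or x == b',
    where y counts observations c1 and x counts observations c2.
    Returns {(x, y): (k, k_star)}; k_star counts the paths whose first observation is c1."""
    counts = {}
    stack = [(0, 0, None)]            # (x, y, was the first observation c1?)
    while stack:
        x, y, first_c1 = stack.pop()
        if y == a or x == b:          # boundary point reached, the path ends here
            k, k_star = counts.get((x, y), (0, 0))
            counts[(x, y)] = (k + 1, k_star + (1 if first_c1 else 0))
            continue
        stack.append((x, y + 1, True if first_c1 is None else first_c1))   # next is c1
        stack.append((x + 1, y, False if first_c1 is None else first_c1))  # next is c2
    return counts

def expectation_of_estimate(counts, p):
    """Sum over boundary points of [k*(g)/k(g)] * k(g) * p^y * q^x."""
    q = 1 - p
    return sum(Fraction(k_star, k) * k * p ** y * q ** x
               for (x, y), (k, k_star) in counts.items())

counts = boundary_counts(3, 4)
p = Fraction(1, 3)
print(expectation_of_estimate(counts, p))
```

The sum reduces to the total probability of all paths beginning with $c_1$, which is p for any closed region; the enumeration merely makes this visible for one concrete rule.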

Let there be given an infinite sequence of sequential tests with each of which 
we associate the estimate p(g). Consider the ith one of these, and let $n_{0i}$ be
the smallest number of observations required for a decision, i.e., $n_{0i}$ is the smallest


1 Really all that is required is a rule for terminating the observations such that its region 
R is closed (see below). However, we defer to conventional statistical usage in referring 
to "tests."


value of j for which $P\{n = j\} \neq 0$. The theorem proved below asserts that if
$n_{0i}$ approaches infinity with i the estimate p(g) converges stochastically to p.
To put it in other words: if $T_1, T_2, \cdots$ is the sequence of tests, and $\epsilon_1$ and $\epsilon_2$
are arbitrarily small positive numbers, there exists a positive number $J(\epsilon_1, \epsilon_2)$
such that, for all $T_i$ such that i > J,

$P\{\,|p(g) - p| > \epsilon_1\} < \epsilon_2,$

when $n_{0i} \to \infty$. An important example of such a sequence is that of the Wald
sequential binomial tests [2] obtained as follows: Let $\alpha_1, \alpha_2, \cdots, \alpha_i, \cdots$ and
$\beta_1, \beta_2, \cdots, \beta_i, \cdots$, be two sequences of positive numbers all of which are less
than 1/2 and which approach zero as $i \to \infty$. Let $p_0$ and $p_1$, $0 < p_0 < p_1 < 1$,
be two fixed numbers,

$c_1 = \log\frac{p_1}{p_0}, \qquad c_2 = \log\frac{1 - p_1}{1 - p_0}, \qquad Z_j = \sum_{k=1}^{j} z_k.$

Finally let the rule for terminating the process of drawing observations be as
follows for the ith test $T_i$: The process of drawing observations terminates at
the smallest integer n for which either

$Z_n \geq \log\frac{1 - \beta_i}{\alpha_i} \qquad \text{or} \qquad Z_n \leq \log\frac{\beta_i}{1 - \alpha_i}.$

Since $(1 - \beta_i)/\alpha_i \to \infty$ and $\beta_i/(1 - \alpha_i) \to 0$ while $c_1$ and $c_2$ are constant, it is
evident that the hypothesis of the theorem is satisfied.
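
That the smallest possible sample size grows without limit for the Wald tests can be seen from a short computation. The sketch below assumes the stopping rule "stop at the first n with $Z_n \geq \log[(1-\beta)/\alpha]$ or $Z_n \leq \log[\beta/(1-\alpha)]$", where each observation adds $c_1 = \log(p_1/p_0)$ or $c_2 = \log[(1-p_1)/(1-p_0)]$ to $Z_n$; the earliest possible termination then occurs on a run consisting only of $c_1$'s or only of $c_2$'s. The function name and the numerical values are ours.

```python
from math import log

def smallest_sample_size(p0, p1, alpha, beta):
    """Smallest n at which some path can terminate, for 0 < p0 < p1 < 1
    and alpha, beta < 1/2 (so that the upper limit is positive and the lower negative)."""
    c1 = log(p1 / p0)                  # positive increment for an observation c1
    c2 = log((1 - p1) / (1 - p0))      # negative increment for an observation c2
    upper = log((1 - beta) / alpha)
    lower = log(beta / (1 - alpha))
    n = 1
    # After n observations Z_n lies between n*c2 and n*c1, so no path can stop
    # before one of these extremes reaches its limit.
    while n * c1 < upper and n * c2 > lower:
        n += 1
    return n

for err in (0.1, 0.01, 0.001, 1e-6):
    print(err, smallest_sample_size(0.4, 0.6, err, err))
```

As the error probabilities decrease, the printed sample sizes increase, in agreement with the remark that the hypothesis of the theorem is satisfied.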

The property of being unbiased is not generally considered an indispensable 
characteristic of an optimum estimate, while consistency is generally so regarded. 
Our theorem shows that p(g) enjoys the latter property with respect to important 
sequences of sequential tests. 

THEOREM: Let $T_1, \cdots, T_i, \cdots$ be a sequence of sequential binomial tests.
For the ith test $T_i$ let $n_{0i}$ be the smallest integer such that $P\{n = n_{0i}\} \neq 0$. Finally
let $n_{0i} \to \infty$ as $i \to \infty$. Then p(g) converges stochastically to p as $i \to \infty$.

PROOF: For typographic simplicity we shall use $n_0$ as the designation of the
generic element of the sequence $n_{01}, n_{02}, \cdots$. No confusion will be caused
thereby.

Let $n' = n_0 - 1$, and $\delta_1 > 0$ and $\delta_2 > 0$ be arbitrarily small fixed numbers.
Let k'(g) be the number of paths which end at the point g and are such that
$|y'/n' - p| < \delta_1$, where y' is the number of observations $c_1$ among the first n'
observations. We then have

LEMMA 1. For $n_0$ sufficiently large

(1)   $\sum_{g \in B} k'(g)\,p^{y} q^{x} > 1 - \delta_2$

where B is the set of boundary points of R.

PROOF: Consider the totality {h} of all points h = (x', y'), with x' + y' = n'.
Here x' and y' denote, respectively, the number of observations $c_2$ and $c_1$ in the
sequence of the first n' observations on z. Let $k_0(h)$ denote the number of paths
to h. Let C denote the set of points h such that $|y'/n' - p| < \delta_1$. If $n_0$ is
large enough we have, by the law of large numbers,

$\sum_{h \in C} k_0(h)\,p^{y'} q^{x'} > 1 - \delta_2.$

Let k(h, g) be the number of paths from h to g. From Theorem 2' of [3] it
follows that

(A)   $\sum_{g \in B} k(h, g)\,p^{y} q^{x} = p^{y'} q^{x'}.$

Also from the definitions of the various symbols involved it readily follows that

$k'(g) = \sum_{h \in C} k_0(h)\,k(h, g).$

Hence

$\sum_{g \in B} k'(g)\,p^{y} q^{x} = \sum_{g \in B}\Big(\sum_{h \in C} k_0(h)\,k(h, g)\Big)p^{y} q^{x} = \sum_{g \in B}\Big(\sum_{h \in C} k_0(h)\,k(h, g)\,p^{y} q^{x}\Big)$
$\qquad = \sum_{h \in C} k_0(h)\Big(\sum_{g \in B} k(h, g)\,p^{y} q^{x}\Big) = \sum_{h \in C} k_0(h)\,p^{y'} q^{x'} > 1 - \delta_2.$

This proves Lemma 1.

Let $\xi(g) = [k(g) - k'(g)]/k(g)$. Thus $\xi(g)$ is a chance variable, being a function
of the chance point g.

LEMMA 2. Let $\delta_3$ and $\delta_4$ be arbitrarily small positive numbers. For $n_0$ sufficiently
large

(2)   $P\{\xi(g) \leq \delta_3\} > 1 - \delta_4.$

PROOF: If (2) were not true, we would have

(3)   $\sum_{g \in B} k'(g)\,p^{y} q^{x} \leq (1 - \delta_4) + (1 - \delta_3)\delta_4 = 1 - \delta_3\delta_4.$

Choose the $\delta_2$ of Lemma 1 so that $\delta_2 < \delta_3\delta_4$. For some large value of $n_0$ we
would then have a contradiction between (1) and (3). This proves the lemma.

Let g be any boundary point. Consider any path whose y' is such that
$|y'/n' - p| < \delta_1$; let us call such a path one of type T. Consider the terminal
sequence S of this path,

$S:\ z_{n_0}, z_{n_0 + 1}, \cdots, z_n.$

This sequence, together with g = (x, y), uniquely determines y'. Any permuta-
tion of y' elements $c_1$ and n' - y' = x' elements $c_2$ may serve as the initial sequence
of n' observations of a path which terminates at g and has the terminal sequence
S. For no boundary point is of index smaller than $n_0$, so that under permuta-
tion of the first n' observations a path remains a path, i.e., the process of taking
observations will not terminate prematurely as a result of the permuting of the
elements. Of these permutations a proportion y'/n' begin with the element $c_1$.
We deal in this manner with all the different terminal sequences of the paths of
type T which end at g. Let k*'(g) be the number of these which begin with $c_1$.
We obtain

LEMMA 3. For all g such that $k'(g) \neq 0$

$\left|\frac{k^{*\prime}(g)}{k'(g)} - p\right| < \delta_1.$

Putting Lemmas 2 and 3 together we have

LEMMA 4. As $n_0 \to \infty$, k*'(g)/k(g) converges stochastically to p.

Now it follows in a manner similar to that of Lemma 2 that, as $n_0 \to \infty$,
k*'(g)/k*(g) converges stochastically to one. This, together with Lemma 4,
proves the theorem.


REFERENCES 


[1] M. A. GIRSHICK, FREDERICK MOSTELLER, AND L. J. SAVAGE, "Unbiased estimates for
certain binomial sampling problems, with applications," Annals of Math. Stat.,
Vol. 17 (1946), pp. 13-23.

[2] A. WALD, "Sequential tests of statistical hypotheses," Annals of Math. Stat., Vol. 16
(1945), pp. 117-186.

[3] J. WOLFOWITZ, "On sequential binomial estimation," Annals of Math. Stat., Vol. 17
(1946), pp. 489-492.





BOOK REVIEWS 


Mathematical Methods of Statistics. Harald Cramér. Uppsala, Sweden:
Almqvist and Wiksell, 1945. pp. xvi, 575. (Princeton, N. J.: Princeton 
University Press, 1946. $6.00) 


REVIEWED BY WILL FELLER 
Cornell University 


This book represents a contribution of a novel kind to the statistical literature 
and will render valuable services both as textbook and reference book. Of its 
three parts the first one (134 pages) is entitled Mathematical Introduction and 
develops the necessary formal mathematical tools. The second part (186 pages) 
is devoted to Random Variables and Probability Distributions, that is to say, to a 
chapter of the modern theory of probability. The third, and main, part of the 
book (some 233 pages) is entitled Statistical Inference. Ordinarily these three 
topics would require consultation of three or more books, and these would rarely 
be found on the same shelf. However, the masterly exposition succeeds in creat- 
ing the impression of natural unity and harmony. The ideas are developed with 
elegance and apparent ease as if the line of presentation followed a well explored 
path. The uninitiated will not notice how unconventional the treatment is and 
how the very selection of topics depends on the author’s scientific personality. 

It is hardly necessary to point out that Cramér’s book fills an urgent need. 
The emergence of statistical theory and methodology as an exact science, firmly 
grounded in mathematical probability, is only of recent date. Its rapid develop- 
ment went hand in hand with an extraordinary increase of the number and im- 
portance of its various applications. Under such circumstances there was 
naturally little time for an exposition of the theoretical foundations and ramifi- 
cations. Modern statistical inference has its roots in the classical limit theo- 
rems of probability. Now classical probability used to consist of a bewildering 
collection of special and mutually uncorrelated problems; unified guiding princi- 
ples and methods are a rather new development and have not yet found expression 
in the textbook literature. The original investigations are usually written in an 
exceedingly abstract language and the existing close ties to applications are not 
apparent. Consequently, there is no easy access either to probability or statis- 
tics and it is often difficult to establish whether, or to what extent, various asser- 
tions have actually been proved. The present book therefore closes a serious 
gap in the literature and will greatly facilitate both teaching and research. 

Of the 12 chapters of the Mathematical Introduction 9 are devoted to the theory 
of measure and integration. The antiquated theory of the so-called Riemann 
integral (kept alive by elementary textbooks) considered only point functions 
y = f(x), where the independent variable is a point. The temperature at a 
given point or the velocity at a given moment are typical examples. Many 
mathematical considerations simplify greatly if from the very beginning also set 



functions y = F(A) are introduced, where the independent variable is a set. 
Typical examples are mass in mechanics, the amount of heat or of electricity, 
area or wealth of a geographic region, and the probability of events (i.e. sets in 
sample space). The Lebesgue-Stieltjes theory frees the concept of integral from 
artificial devices and reduces it to the natural notion of mean values with respect 
to set functions. In a simile, believed to be due to Lebesgue, the Riemann inte- 
gral corresponds to the procedure of a grocer who computes the day’s receipts 
by actually adding the several amounts in the order as they had come in. The 
Lebesgue procedure imitates the more intelligent grocer who orders his cash in 
piles of notes and coins of equal denomination and counts them. The analogy 
with the customary procedure of computing mathematical expectation is clear. 
The Lebesgue-Stieltjes integral is conceptually simpler than the Riemann integral 
and can be presented in as simple a way with rigor adequate for elementary text- 
books. It has become an indispensable tool in probability, statistics, physics, 
and other applied fields. Since it has, unfortunately, not found its way into 
calculus textbooks, physicists are compelled to use the less flexible notion of the 
Dirac δ-function, and the formal mathematical apparatus in general becomes
unnecessarily clumsy. It is a curious anomaly that so many calculus textbooks 
profess to be written with a view to applications and yet completely disregard 
the most obvious practical needs and that the teaching of practical mathematics 
should remain uninfluenced by the great developments of the last fifty years. 

In such circumstances the chapters on integration will be particularly welcome 
to statisticians as probably the only place in the literature where they will find 
easy access to the theory. Of course, this exposition leads far beyond what the 
average statistician will require under ordinary circumstances and beyond the 
necessary prerequisites of the main body of the book. Of the 88 pages roughly 
half can be omitted at first reading in accordance with detailed instructions given 
in the Preface. The remaining half will form a valuable reference book for 
theorems and tools used occasionally in connection with more delicate parts of 
statistical theory. The mathematical introduction contains also a chapter on 
Fourier integrals (characteristic functions), one on matrices and quadratic forms, 
and finally miscellaneous complements such as orthogonal polynomials, Euler’s 
summation formula, beta and gamma functions, etc. 

The title to the second part, Random Variables and Probability Distributions, 
is the same as that of the author’s well-known Cambridge Tract of 1937. Both 
start with a discussion of the foundations along axiomatic lines. The new treat- 
ment does not differ essentially from the old one, but some changes are intro- 
duced which are regrettable in the reviewer’s opinion (in particular axiom 3). 
Otherwise there is practically no overlap between the two expositions. The 1937 
booklet devoted much space to the asymptotic expansions connected with the 
central limit theorem which are due to the author himself. This topic is not 
touched upon in the present book. This is a judicious procedure since the 1937- 
booklet is generally accessible (although at present sold out). Instead we now 
find a detailed study of some univariate distributions such as χ², Student's t,













Fisher’s z, the Pearson system, etc., none of which were mentioned in the Cam- 
bridge tract. Similarly, there is now a section on correlation and regression, 
and the normal distributions in several variables. The theory of probability is
developed only to the extent of the formal theory of distribution functions. This 
implies that even so important a notion as stochastic convergence is treated only 
summarily while the strong law of large numbers falls completely outside the 
framework of the book. This is regrettable inasmuch as the strong law is of
greater importance than the classical weak law (whose fame rests essentially on 
a classical misunderstanding). It should be mentioned that this second part of 
the book contains some 39 well chosen illustrative exercises the solution of which 
is left to the reader. 

In the main part of the book, entitled Statistical Inference, the outer form 
changes inasmuch as the text there is accompanied by numerous practical exam- 
ples. However, the exposition remains mathematical in nature and the main 
emphasis rests on exact formulations; much attention is paid to the establishment 
of the precise conditions of validity of the individual theorems, their logical 
interrelations and their connections with general probability. The expert will 
find many minor and major improvements in formulations and proofs. They are 
too numerous to be listed here. Suffice it to point out, as a typical example, the 
theorem on pp. 426-27 concerning the limiting form of the χ² distribution with
estimated parameters; this theorem appears to be more general than usually 
stated and also the proof seems to be novel. The topics treated in the statistical 
part of the book will be seen from the following list of titles to the chapters. 25. 
Preliminary Notions on Sampling. 26. Statistical Inference (general orienta- 
tion). 27. Characteristics of Sampling Distributions (moments, semi-invariants, 
corrections for grouping, etc.). 28. Asymptotic Properties of Sampling Dis- 
tributions (moments, extreme values, range, etc.). 29. Exact Sampling Distri- 
butions (degrees of freedom, Student, Fisher, correlation and regression coeffi- 
cients, partial and multiple correlations, generalized variance, etc.). 30. Tests 
of Goodness of Fit and Allied Tests (treating mostly applications of χ²). 31.
Tests of Significance for Parameters. 32. Classification of Estimates (sufficient,
efficient and asymptotically efficient estimates; minimum variance, etc.). 33.
Methods of Estimation (method of moments, maximum likelihood, χ²-minimum
methods). 34. Confidence Regions. 35. General Theory of Testing Statistical
Hypotheses. 36. Analysis of Variance. 37. Some Regression Problems. There
follow tables of the normal distribution, the χ² and the t-distributions, and a long
list of references. 

If an expression of wishes for a second edition were permitted, most statisti- 
cians would probably give first choice to non-parametric and sequential tests. 
It is needless to point out that the latter became public only after completion of 
the Swedish edition of the present book.

Even this short account will show the extremely wide range of topics and 
theories covered in the book, from abstract integration to randomized experiments.
They are all presented with uniform lucidity. The exposition throughout
is formal, and yet inspiring, rigorous and yet never pedantic. It will serve
as an example worthy of imitation and is an achievement on which the author 
deserves our sincere congratulations. 




The Advanced Theory of Statistics. Vols. I and II. Maurice G. Kendall. 
London: C. Griffin and Co., Ltd. Vol. I. Second ed. revised, 1945; pp. xii, 
457, 50 shillings. Vol. II. 1946; pp. viii, 521; 42 shillings. 


REVIEWED BY M. S. BARTLETT
Cambridge University and The University of North Carolina 


With the recent appearance of the second volume, it is now possible to review 
as one work this comprehensive treatise. To quote the author’s opening re- 
marks to the Preface to Volume I: “The need for a thorough exposition of the 
theory of statistics has been repeatedly emphasized in recent years. The object 
of this book is to develop a systematic treatment of that theory as it exists at the 
present time.” An outline of the contents, which in the two volumes make up 
just on a thousand pages, will indicate that this formidable task has been squarely 
faced by the author, who, when a tentative co-operative venture of writing such 
a treatise was upset by the outbreak of the war, continued alone with the project. 

Volume I contains sixteen chapters. The first six introduce the concept of 
frequency distributions via observational data on groups and aggregates, and 
their mathematical representation (Ch. 1), measures of location and dispersion 
(Ch. 2) and moments and cumulants in general (Ch. 3), characteristic functions 
(Ch. 4), and ending with a description of the standard distribution functions, such 
as the binomial, Poisson, hypergeometric and normal distributions, and the 
Pearson and Gram-Charlier systems. The next section opens with probability 
(Ch. 7) and proceeds to sampling theory (Chs. 8-11), including a chapter (Ch. 10) 
on exact sampling distributions, many of the standard sampling distributions 
being used in this chapter to illustrate the mathematical methods available for 
obtaining sampling distributions. Chapter 11 deals with the general sampling 
theory of cumulants, including a useful reference list of formulae and a demon- 
stration, due to the author, of the validity of Fisher’s combinatorial rules for 
obtaining these formulas. The section concludes with a chapter on the Chi- 
square distribution and some of its applications. The last four chapters of 
Volume I deal with association and contingency, correlation, including partial 
and multiple correlation, and rank correlation; this last chapter being a compre- 
hensive treatment including comparatively recent results of the author. 

It will be convenient to list also the contents of Volume II before any critical 
comment on either volume. The first section of the second volume comprises 
four chapters on the theory of estimation, including a derivation of the properties 
of the maximum likelihood estimate (Ch. 17) and separate chapters on Fisher’s 
theory of fiducial probability and Neyman’s theory of confidence intervals. The
second main section, according to the author’s remarks in the preface to Volume 
II, deals with the theory of statistical tests and comprises chapters 21, 23, 24, 26, 
27 and 28; of these after an introductory chapter (Ch. 21) on tests of significance, 
chapters 23 and 24 cover analysis of variance, chapters 26 and 27 give a fairly 
detailed account of the general theory of significance-tests originated by Neyman 
and Pearson and Chapter 28 deals with the recently developed techniques of 
multivariate analysis. The remaining chapters are 22 on regression, 25 on the 
design of sampling enquiries, and Chapters 29 and 30 on time-series, another 
subject in which the author has himself taken an active interest. Finally, there 
are two appendices, A consisting of a few addenda to Volume I, and B an exten- 
sive bibliography of theoretical statistical papers. 

The volumes are attractively printed; and each chapter concludes with a useful 
collection of examples for the reader. 

In any comprehensive treatment of a wide subject there can be no clearly defined
order of presentation; nevertheless, the author’s order of chapters in Volume
II and in particular his inclusion of analysis of variance among the chapters on the 
theory of statistical tests is a little puzzling, and the reviewer’s preference would 
have been to see this important subject treated earlier, together with regression 
analysis, and their link with the classical method of least squares more firmly 
outlined. Incidentally, there appears to be no mention of the Fourier analysis 
of observational data except in its relation to periodogram analysis (Ch. 30). 
This change of order would perhaps also have allowed a shift forward of Chapter 
25 on the design of sampling enquiries, and a more compact section on multiple 
correlation, culminating with the chapter on multivariate analysis before the 
chapters on the general theory of statistical inference were begun. 

Another arrangement of rather doubtful value in Volume II is the allocation of 
separate chapters to fiducial probability and to the theory of confidence inter- 
vals. The problem of how to deal with a field which is still a battleground is 
admittedly not an easy one, and this particular one is an embarrassment at 
present to many teachers, but it may be questioned whether strict impartiality 
is the best answer. To take a hypothetical example, there would seem to be no 
particular virtue in a textbook which expounded, in parallel, statistical methods 
of inference using direct probabilities and the method of “inverse probability”, 
leaving the reader to decide at the end which he should adopt. 

The most criticizable arrangement, however, occurs in Volume I with the late 
and rather scanty treatment of probability in Chapter 7. To begin with ex- 
amples of statistical data is sound, but since the whole conceptual model erected 
to deal with such data is based on probability theory, it does not seem sufficient 
for a reader who “feels keenly on the subject” to do as the author suggests in the
Preface and read Chapters 7 and 8 after Chapter 1. Even if he does so, he will 
find no very clear exposition of the statistical theory of probability, no mention,
for example, of the laws of large numbers, whether for simple dichotomies or for 
entire continuous distribution functions, that show how the conceptual model 
adequately corresponds with the empirical notions of “in the long run” or “for
a large enough sample”. The actual arrangement, moreover, leads to an apparently
rather arbitrary treatment of theorems on limiting distributions;
the First Limit Theorem, which deals with the equivalence of the limits of dis- 
tribution function and corresponding characteristic function sequences, is given 
in the chapter on characteristic functions (Ch. 4), and the Central Limit Theo- 
rem, dealing with the convergence to normality of a sum of n independent random 
variables, is given in the chapter on probability. 

In the proof of the second part of the First Limit Theorem, dealing with the
conditions under which a sequence φ_n(t) of characteristic functions determines
the limiting distribution function F(x), the author has not yet corrected an error
that occurred in Cramér’s original version, which Kendall follows (section 4.12).
Correct conditions for convergence of the distribution function sequence F_n(x) to
F(x) (at all continuity points of F) are convergence of the characteristic function
sequence to φ(t) for all real t, uniformly in at least some finite t interval (cf. H.
Scheffé, Math. Reviews, Vol. 6 (1945), p. 89).
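
The corrected conditions quoted above can be set out symbolically. What follows is only a sketch of the usual continuity theorem, not a transcription of Kendall's or Cramér's text. The symbols \varphi_n, \varphi, F_n, F and the half-width a are notation assumed here, and the interval of uniform convergence is taken to contain t = 0, as in the standard statement; the review itself says only "some finite t interval".

```latex
% Characteristic functions of the distribution functions F_n:
\[
  \varphi_n(t) = \int_{-\infty}^{\infty} e^{itx}\, dF_n(x), \qquad n = 1, 2, \dots
\]
% Hypothesis: pointwise convergence everywhere, uniform near the origin.
\[
  \varphi_n(t) \to \varphi(t) \quad \text{for every real } t,
  \qquad
  \sup_{|t| \le a} \bigl| \varphi_n(t) - \varphi(t) \bigr| \to 0
  \quad \text{for some } a > 0 .
\]
% Conclusion: the limit is itself a characteristic function, and F_n converges.
\[
  \varphi(t) = \int_{-\infty}^{\infty} e^{itx}\, dF(x)
  \ \text{ for some distribution function } F,
  \qquad
  F_n(x) \to F(x) \ \text{ at every continuity point } x \text{ of } F .
\]
```

Uniform convergence near the origin is what makes the limit function continuous at t = 0. That continuity keeps probability mass from escaping to infinity and so ensures that F is a proper distribution function.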

Another proof in Volume I which appears to need clarification is the geometri- 
cal derivation of the distribution of the multiple correlation coefficient in the case 
of a non-zero true correlation (section 15.21). The blunt statement is made, 
following equation (15.51), that the sample correlation coefficient R and an angle 
y (defined in the text) are independent, a statement which is incorrect. How- 
ever, if the logic of Fisher’s original derivation is examined, it turns out that the 
relation of R and y is only required when the true correlation is zero; under such 
conditions R and y are independent. 

In Volume II there is a sentence requiring correction and amplification in the 
derivation (in the case of zero true canonical correlations) of the sampling canonical
correlation distribution (section 28.30). The sentence “Consider the distribution
for a given value of ξ_ij and z_ij ...” should be corrected to read “Consider
the distribution for a given value of ξ_ij + z_ij ...”. Some justification that
the distribution is independent of ξ_ij + z_ij is then still needed.

There is inevitably, owing to the time the book was written, no mention of 
sequential analysis, the sampling technique developed during the war by Wald 
and others and only recently “derestricted”. Again, in chapter 18, where the
work of Aitken and Silverstone on unbiased estimates with minimum variance is 
referred to, the simple inequality connecting the variance of any unbiased esti- 
mate with Fisher’s information function throws an interesting new light on this 
aspect of the estimation problem (see, for example, H. Cramér, Mathematical 
Methods of Statistics, section 32.3, or C. R. Rao, Bulletin Calcutta Math. Soc., 
Vol. 37 (1945), p. 81), but was not known to the author when this chapter was 
written. Such omissions are merely an indication of the developing nature of the 
subject, and it is hoped they can be remedied in later editions. There is, how- 
ever, especially in Volume II, an occasional impression of patchiness in the treat- 
ment not altogether excusable on such grounds. This can perhaps be illustrated 
from the last chapter, a valuable contribution to the still-growing subject of 
time-series, but where the importance of some known results does not always
seem sufficiently stressed; in particular, the Wiener-Khintchine relation between 
the periodogram and correlogram is noted (section 30.68) as “an interesting relation”,
whereas it is a fundamental relation in the modern method of approach
to time-series, giving much deeper insight into the correct interpretation of 
classical periodogram analysis. 
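
The relation between periodogram and correlogram mentioned above can be checked numerically on a finite series. The Python sketch below is illustrative only: the function names, the 1/n normalisation and the use of sample autocovariances with divisor n are all assumptions made here (authors differ on the constants), and nothing in it is taken from Kendall's section 30.68.

```python
import cmath
import math


def periodogram(x, omega):
    """Periodogram ordinate (1/n)|sum_t (x_t - mean) e^{-i omega t}|^2."""
    n = len(x)
    mean = sum(x) / n
    s = sum((v - mean) * cmath.exp(-1j * omega * t) for t, v in enumerate(x))
    return abs(s) ** 2 / n


def autocovariances(x):
    """Sample autocovariances c_0, ..., c_{n-1}, each with divisor n."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + h] for t in range(n - h)) / n for h in range(n)]


def periodogram_from_correlogram(x, omega):
    """Same ordinate rebuilt as c_0 + 2 * sum_h c_h cos(omega h)."""
    c = autocovariances(x)
    return c[0] + 2.0 * sum(c[h] * math.cos(omega * h) for h in range(1, len(c)))
```

With these conventions the two functions agree at every frequency, up to rounding error. Dividing both sides by c_0 expresses the same identity in terms of the serial correlations r_h = c_h / c_0, which are the ordinates of the correlogram.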

These criticisms, which could be extended to cover minor errors and mis- 
prints, are not intended to detract seriously from what is a remarkable achieve- 
ment. An excellent sense of proportion has been maintained throughout be- 
tween mathematical theory and illustrative discussion and examples. This makes 
this treatise, if both the breadth and level of the subject matter are taken into 
account, at present unique. It will be an indispensable reference book to every 
teacher and advanced student of the theory of statistics. 




Sequential Analysis of Statistical Data: Applications. Prepared by the 
Statistical Research Group, Columbia University for the Applied Mathe- 
matics Panel, National Defense Research Committee, Office of Scientific Re- 
search and Development. SRG Report 255, Revised; AMP Report 30.2R, 
Revised. New York: Columbia University Press, September 1945. pp. vii, 
17; iv, 80; v, 57; iii, 25; iii, 18; iii, 39; ii, 41. $6.25. (London: Oxford University
Press, 1946.)


REVIEWED BY JOHN W. TUKEY 
Princeton University 


Many of the features of this compendium are familiar to most of the readers of 
this review, but for the benefit of the others I shall enumerate them briefly. It 
consists of a heavy looseleaf binder containing 7 booklets of distinctive colors— 
each saddle stitched and usable separately. It is the last word (to date) in pre- 
senting sequential analysis to the statistician who may wish to use it in practice. 
It covers five elementary cases (each in a booklet, the two others being used for 
introduction and appendices): 


Acceptance or rejection by percent defective (Sec. 2) 

Comparative percent satisfactory (Double dichotomy) (Sec. 3) 

Acceptance or rejection by the adequacy of the mean (with known variability) 
(Sec. 4) 

Acceptance or rejection by the exact value of the mean (with known variability) 
(Sec. 5) 

Acceptance or rejection by the smallness of the variability (Sec. 6) 


These cases are covered in complete detail, with illustrative examples, tables and 
charts. A copy should be accessible to every teacher of statistics and to every 
statistician in industry or experimental work who can propose new techniques of 
testing. 







With this general introduction let us go on and explain what the reader will 
not find and what further work in this line the reviewer awaits with keen interest. 
The classical testing procedure was to test a sample of predetermined size and 
then decide to accept or reject. Long ago curtailed sampling and double samp- 
ling were developed to cut corners legitimately and reduce inspection costs. 
There are two situations, each more frequent in war than in peace, where it is 
clearly desirable to reduce the average number of items tested to a minimum: 


(I) Where essentially all lots are accepted and the test is destructive so that the 
items tested are the main loss of production, or 


(II) Where the cost of testing an item is large in comparison with the cost of 
production. 


Subject to a practically unimportant allowance for the finite size of the lots, and 
to an allowance of unknown importance for the quality of lots presented, the 
methods of sequential analysis minimize this average number among all methods 
so far considered. When situation (I) or (II) holds without modifying complications,
then, the best known method is sequential analysis, the natural descendant
of double sampling. Otherwise, the situation is far from clear, and much judge- 
ment is involved in setting up a practically efficient scheme. The reader will 
get no help on this problem of judgement, nor in the problem of setting risks from 
the book under review—he will get every needed help with the mathematical 
problem of setting up a sequential plan to meet chosen risks, including complete 
tables of all necessary functions, including natural logarithms. 
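
For readers who have not seen a sequential plan, the first of the five cases (acceptance or rejection by percent defective) can be sketched in a few lines of Python. This is a minimal illustration of Wald's sequential probability ratio test with his usual approximate limits (1 - beta)/alpha and beta/(1 - alpha). It is not a reproduction of the SRG tables or charts. The function name and the parameters p0 (acceptable fraction defective), p1 (rejectable fraction defective), alpha and beta (the two risks) are assumptions chosen here for illustration.

```python
import math


def sprt_defectives(items, p0, p1, alpha, beta):
    """Inspect items one at a time (1 = defective, 0 = good).

    Returns (decision, number_inspected), where decision is "accept",
    "reject", or "continue" if the items run out before a boundary is reached.
    """
    if not (0.0 < p0 < p1 < 1.0):
        raise ValueError("need 0 < p0 < p1 < 1")
    upper = math.log((1.0 - beta) / alpha)   # crossing this rejects the lot
    lower = math.log(beta / (1.0 - alpha))   # crossing this accepts the lot
    step_defective = math.log(p1 / p0)                # added for a defective item
    step_good = math.log((1.0 - p1) / (1.0 - p0))     # added for a good item (negative)
    log_ratio = 0.0
    inspected = 0
    for item in items:
        inspected += 1
        log_ratio += step_defective if item else step_good
        if log_ratio >= upper:
            return "reject", inspected
        if log_ratio <= lower:
            return "accept", inspected
    return "continue", inspected


# Illustrative risks and quality levels, chosen arbitrarily for the example.
print(sprt_defectives([0] * 50, p0=0.05, p1=0.20, alpha=0.05, beta=0.10))
print(sprt_defectives([1] * 50, p0=0.05, p1=0.20, alpha=0.05, beta=0.10))
```

The choice of p0, p1, alpha and beta is exactly the problem of judgement on which, as noted above, the book offers no help; the sketch takes them as given.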

There is no reason to suppose that sequential analysis is the last word in testing 
procedures for the general problem of efficient testing, but what should be the 
next step ahead is not a step for the mathematical statistician. What is needed 
now is a careful analysis, by the operational research techniques so useful during 
the war, of a half-dozen industrial testing situations to determine what properties 
of the testing procedure are involved in cost and to what extent. Do we want 
the minimum average sample size, the minimum average square of the sample 
size—or what? With this there should go a corresponding operational study of 
the advantages of different OC curves, including those of what now seems to be 
a peculiar shape. Given these studies, we could put the problem in mathematical 
statistics to the mathematical statistician which he would then solve. But with 
the present lack of operational research groups in industry, it is probable that 
we will proceed in an unnatural way, and that the mathematical statistician will 
take the next step forward. For reasons of mathematical simplicity it is not 
unlikely that the sample plan with the minimum average squared sample size 
will come next. 
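
The "minimum average sample size" criterion can be made concrete for the percent-defective case. The Python sketch below evaluates Wald's approximate average sample numbers at the two quality levels for which a plan is designed. It rests on the approximation that neglects any overshoot of the decision limits. The function name and the parameters p0, p1, alpha and beta are assumptions introduced here for illustration.

```python
import math


def approx_average_sample_numbers(p0, p1, alpha, beta):
    """Wald's approximate average sample numbers at fraction defective p0 and p1.

    Uses the approximate limits log A = log((1 - beta)/alpha) and
    log B = log(beta/(1 - alpha)), and ignores overshoot of the limits.
    """
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    z_defective = math.log(p1 / p0)
    z_good = math.log((1.0 - p1) / (1.0 - p0))

    def mean_step(p):
        # Expected increment of the log likelihood ratio per item inspected.
        return p * z_defective + (1.0 - p) * z_good

    # At p0 the lot is accepted with probability about 1 - alpha;
    # at p1 it is accepted with probability about beta.
    asn_at_p0 = ((1.0 - alpha) * log_b + alpha * log_a) / mean_step(p0)
    asn_at_p1 = (beta * log_b + (1.0 - beta) * log_a) / mean_step(p1)
    return asn_at_p0, asn_at_p1


asn0, asn1 = approx_average_sample_numbers(0.05, 0.20, 0.05, 0.10)
print(round(asn0, 1), round(asn1, 1))
```

A criterion such as the average squared sample size would need the whole distribution of the sample size, not merely its mean, and the sketch does not attempt that.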

The credit for the book is clearly assigned on the inside cover of each pamphlet 
in the following words: “So many members of the Statistical Research Group
(Columbia) have participated in the preparation of this report, a previous edi- 
tion of which was prepared by H. A. Freeman, that its authorship is attributed 
to the group as a whole. The responsibility for planning and preparing this
edition has been shared by H. A. Freeman, M. A. Girshick, and W. Allen Wallis, 
with the cooperation of Kenneth J. Arnold, Milton Friedman, Edward Paulson, 
and others. The theory of sequential analysis is mainly the work of A. Wald.” 

It may be of interest to notice a few minor points for the record. On page 
1.01 it is indicated that 100% inspection is 100% effective; this seems far from
industrial experience. Another badly needed set of operational studies would be 
on the influence of the sampling plan on inspector’s inspection. On page 2.27, 
the footnote suggests that when a tabular procedure is used instead of
a graphic one, more decimal places should be kept; the logic of this is not
clear. On page 4.14 it is stated that “similarly, if all patches had tested 400 
minutes, the experiment would have terminated at 9.4...”. Clearly no such 
experiment can terminate after a fractional number of tests. On page A.09 
it is stated that “Finally it should be mentioned that truncation of any kind 
ought generally to be avoided”. This seems to the reviewer to be a rash state- 
ment, for when not only average sample size but all other properties entering into 
the practical efficiency of a sampling plan are considered, this decision will almost 
certainly be reversed. The relatively small number of these detailed points is 
an evidence of careful and competent workmanship. 

A footnote to the Appendix (B) on some principles of sequential analysis states: 
“Any mathematician who may stray into this Appendix should be assured that 
the validity of the conclusions in no case depends upon the type of reasoning 
presented here; indeed, even for intuitive or heuristic arguments mathematicians 
may prefer those given in SRG 75”. This warning and caveat seems unduly 
strong—the appendix is recommended to all mathematically minded newcomers 
to sequential analysis. 


The same appendix warns the reader in a few places that the theory set forth 
does not allow for the fact that samples come in units. If the reader tries to 
apply the theory to cases far from normal inspection practice, for example with 
risks of 0.25 and average sample sizes of 12, he will then find out that this does 
occasionally make a difference. In conventional circumstances the approxima- 
tion will not bother him. 





NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest.


Personal Items 


Mr. Kurt W. Back has accepted a position with the Research Center for Group 
Dynamics, Massachusetts Institute of Technology. 

Mr. Stanley D. Canter was discharged from the Army in October and has been 
enrolled as a graduate student in mathematical statistics at Columbia University. 

Mr. William W. Cooper has accepted a position at Carnegie Institute of Tech- 
nology, Pittsburgh. 

Mr. Robert Dorfman is enrolled as a graduate student in the Department of 
Economics, University of California, Berkeley, and is also serving as a teaching 
assistant in that department. 

Dr. Nicholas Fattu, formerly at Michigan State College, has accepted a teach- 
ing position at Indiana University, Bloomington. 

Mr. John P. Gill is now Chief of the Research and Progress Analysis Division, 
War Assets Administration, Houston Regional Office, Texas. 

Dr. Clausin D. Hadley has accepted a position with the Graduate School of 
Business, Stanford University. 

Mr. Malcolm H. Henry is now Assistant Statistician in the Statistical Depart- 
ment of the Michigan State Department of Social Welfare, Lansing. 

Dr. Alston S. Householder has accepted a position as Principal Physicist with 
the Monsanto Chemical Company, Clinton Laboratories, Oak Ridge, Tennessee. 

Mr. Morton Kramer is now with the Office of International Health Relations, 
U.S. Public Health Service, Washington. 

Mr. F. C. Leone, who was discharged from the military service in the fall, has 
returned to his former position in the Department of Mathematics at Purdue
University, Lafayette, Indiana. 

Mr. Philip J. McCarthy, formerly at Princeton University, is now at Cornell 
University, Ithaca, New York. 

Mr. Edward C. Molina has been named special lecturer in Mathematics at 
Newark College of Engineering, in addition to Dr. Emil J. Gumbel, previously 
mentioned. 

Mr. Nicholas Pastore has accepted a position in the Department of Mathe- 
matics, City College of New York. 

Dr. William S. Robinson is now Assistant Professor of Sociology and Statistics, 
University of California at Los Angeles. 

Dr. Leonard J. Savage, who has a Special Rockefeller Fellowship, is spending 
the academic year at the Institute of Radiobiology and Biophysics, University of 
Chicago. 













Professor Dunham Jackson died at Minneapolis on November 6, 1946. From 
1919 until 1946 Mr. Jackson was Professor of Mathematics at the University of 
Minnesota, and in 1946 was named Professor Emeritus. 

Professor Charles C. Wagner died suddenly on May 23, 1946, at the age of 52.
He was acting dean of the College of Liberal Arts of Pennsylvania State College 
when he died. 


Those interested in the work of the Mathematical Tables Project will, upon
request, be placed on the mailing list for copies of the monthly progress reports, 
issued by the Project. Requests should be addressed to Dr. Arnold N. Lowan, 
150 Nassau St., New York, N. Y. 




Statistical Research Laboratory, University of Michigan 


Several developments in instruction and research in the general field of sta- 
tistics are in progress at the University of Michigan. 

At the beginning of the current academic year the new Statistical Research 
Laboratory was opened. It is planned that this unit, which is a division of the 
Graduate School, will serve as the center for research employing statistical me- 
thods and for research in statistical methodology. Free consultation and advice 
on statistical matters are offered to all members of the University engaged in 
research and the latest types of computing machines are available for their use at 
no cost to them. Alternatively, the Laboratory will undertake, at fees to cover costs,
computing and the analysis of data for such individuals or units of the University.
The Laboratory will have available the services of the University’s completely 
equipped Sorting and Tabulating Station and expects to continue to provide a 
center for the most efficient computing service as improved machines are de- 
veloped. The technical assistants employed by the Laboratory will be advanced 
students of statistics who will thus have the opportunity to supplement their 
training with experience with actual statistical investigations. Professor 
C. C. Craig as Director and Professor P. S. Dwyer are in charge of the new laboratory,
each on a half-time basis.

The new Laboratory is a research and not a teaching unit and is distinct from 
the large statistical laboratories for the use of students in statistics courses already 
in existence on the campus. With respect to instruction in theoretical statistics, 
the curriculum in that subject in the Mathematics Department has recently been 
revised and extended to include twenty-four semester hours at the undergraduate 
and graduate levels in addition to courses in probability, finite differences, 
graphical methods, and quality control. The somewhat related professional 
program in actuarial mathematics has likewise been strengthened. The teaching 
staff for these two curricula includes Professors H. C. Carver, A. H. Copeland, 
C. C. Craig, P. S. Dwyer, C. H. Fischer, and C. J. Nesbitt. 

A number of postwar research programs whose pursuit involves the use of 
probability and statistical methods have been established at the University of 
Michigan. Of especial interest is the new Survey Research Center under the
leadership of Professors R. Likert and A. A. Campbell who will continue activities
begun by their group in Washington in the Department of Agriculture. Research
by survey methods in the social sciences for public and private agencies
and in survey methods themselves will be pursued and in addition a training
program combining formal courses and apprenticeship in the Center is being
set up.


New Members 


The following persons have been elected to membership in the Institute: 


Albert, George E., Ph.D. (Wisconsin) Head, Mathematics Division, Research Dept., Naval 
Ordnance Plant, Indianapolis, Ind., 1104 N. Oakland Ave. 

Ament, Richard P., B.A. (Cornell) Scientific Aid, 2129 20th St., N., Arlington, Va. 

Bennett, Myra S., (Mrs. C.A.). A.B. (Michigan) Office Mgr., Institute of Math. Stat., 
Rackham Bldg., Ann Arbor, Mich., P.O. Box #3, Saline. 

Blankmeyer, Edith., A.B. (Western College) Stat., Res. Dept., National Broadcasting Co., 
30 Rockefeller Plaza, New York 20, N. Y. 

Blyth, Colin, Jr., M.A. (Queen’s Univ. and Univ. of Toronto) Graduate student, Univ. of 
N. Car., Chapel Hill, N. Car., 209 Mangum Dormitory. 

Brown, Philip, B.S. (Pittsburgh) Stat., R. 329 Standard Oil Bldg., 3rd and Constitution 
Aves., Washington, D. C. 

Bruno, O. P., B.M.E. (New York Univ.) Chief, Methods Section, Ballistic Research Labs., 
Aberdeen Proving Ground, Md. 

Carrier, Norman H., M.A. (Cantab) Civil Servant, Mathematical Statistics Section of 
Chief Scientific Advisers Division, Ministry of Works, c/o Westminster Bank, Palmers’ 
Green, N. 18, London, England. 

Chand, Uttam, M.A. (Punjab Univ., India) Graduate student, Univ. of N. Car., Chapel 
Hill, N. Car., 112 Mangum Dormitory. 

Crow, Edwin L., Ph.D. (Wisconsin) Mathematician, Science Dept., Res., Devel., and Test 
Organization, USNOTS, Inyokern, Calif. 

Dang, Mary., M.A. (California) Graduate student, Columbia University, New York 27, 
N. Y., Box 267, Johnson Hall. 

Ens, Catherine C., B.S. (Dayton) Stat. Res. Ass’t, Graduate School, Ohio State Univer- 
sity, Columbus, Ohio, 267 Fifteenth Ave., Columbus 10. 

Fox, William H., Ph.D. (Indiana) Ass’t Prof. of Educ. and Ass’t Director of Res. and 
Field Service, Indiana Univ., Bloomington, Ind., 729 E. Hunter. 

Geisler, Murray A., M.A. (Columbia) Operations Analyst, Headquarters Army Air 
Forces, 222 N. Piedmont St., Arlington, Va. 

Gershenson, Charles P., B.B.A. (C.C.N.Y.) Res. Assoc., Institute of Psychological Res., 
Box 130, Teachers College, New York 27, N. Y. 

Gilford, Leon, A.B. (Brooklyn) Econ. Analyst, Census Bureau, Washington, D.C., 1410 
19th St., S. E. 

Goudsmit, S. A., Ph.D. (Leyden) Prof. of Physics, Northwestern Univ., Evanston, Ill.

Halperin, Max, M.S. (Iowa) Graduate student, Univ. of N. Car., Chapel Hill, N. Car., 211 
No. Columbia. 

Halperin, Sidney L., Ph.D. (Ohio State) Psychologist, Neuropsychiatric Institute, Univ. of 
Mich. Hospital, Ann Arbor, Mich., 2401 Pittsfield Blvd., Pittsfield Village. 

Herbach, Leon H., A.B. (Brooklyn) Sub. Instr., Dept. of Math., Brooklyn Coll., N. Y., 
1926 64th St., Brooklyn 4. 

Hoeffding, Wassily, Ph.D. (Berlin) 151 West 88 St., New York 24, N. Y. 

Huhndorff, Roland F., B.S. (St. Mary’s Univ.) Ass’t to Ass’t Chief Chemist, The Texas
Co., Res. Lab., Port Arthur, Texas.


James, William C., A.B. (Knox Coll.) Director, Stat. Div., National Safety Council, 29 
N. Wacker Dr., Chicago 6, Ill., 7835 So. Dobson Ave., Chicago 19. 

Lev, Joseph, Ph.D. (Cornell) Ass’t Civil Service Examiner, N. Y. C. Civil Service 
Comm., and Lecturer, Teachers College, Columbia Univ., N. Y., 8550 Forest Parkway, 
Woodhaven 21. 

Linder, Arthur, Ph.D. (Bern) Prof. of applied math. stat., University of Geneva, Switzerland,
Avenue de Champel 24.

Lord, Frederic M., M.A. (Minnesota) Ass’t Director, Graduate Record Examination, 437 
West 59th St., New York 19, N. Y., 153 W. 63rd St. 

Marshall, Herbert, B.A. (Toronto) Dominion Stat., Dominion Bureau of Statistics, 
Ottawa, Canada. 

Meacham, Alan D., Supv., Sorting and Tabulating Station, and Lecturer, School of Bus.
Adm., Univ. of Mich., Ann Arbor, Mich., 114 Rackham Bldg. 

Miller, Irving, B.S. (C.C.N.Y.) Stat., Bureau of Labor Stat., Washington, D.C., 1900 
Biltmore St., N. W., Washington 9. 

Nanda, D. N., M.A. (Agra, India) Graduate student, Univ. of N. Car., Chapel Hill, N.
Car., Dept. of Statistics.

Pines, Sylvia F., M.A. (Michigan) Instr., Math. and Stat., 43-17 48th St., Long Island 
City 4, N. Y. 

Quastler, Henry, M.D. (Vienna) Medical Radiologist, Carle Hospital Clinic, Urbana, Ill.,
612 W. Nevada. 

Reiersol, Olav, Ph.D. (Stockholm) Teacher of stat., Univ. of Oslo, Oslo, Norway, Interna- 
tional House, 500 Riverside Dr., New York 27, N. Y. 

Romanovsky, Vsevolod I., Ph.D. (Moscow) Prof. at the Univ. and Member of the Academy
of Sciences, Tashkend, U. S. S. R.

Rust, Charles H., S.J., M.A. (St. Louis) Graduate student, St. Louis Univ., St. Louis, 
Mo., 221 N. Grand Blvd., St. Louis 3. 

Seal, Hilary L., B.Sc. (Univ. Coll., London) Head of Stat. Branch, Room 2, Old Bldg., G., 
Admiralty, Whitehall, London, S. W. 1, England. 

Serbein, Oscar N., Jr., M.S. (Iowa) Graduate student, Columbia Univ., New York 27, 
N. Y., Army Hall, Rm. 333H, 1560 Amsterdam Ave., New York 81. 

Sholl, D. A., B.Sc. (London) Stat. in Math. Stat. Section of Chief Scientific Adviser’s 
Div., Ministry of Works, 81 Lynmouth Ave., Bush Hill Park, Enfield, Middlesex, 
England. 

Siegel, Irving H., M.A. (New York) Chief, Economics Div., Veterans Adm., Washington, 
D. C., 407 9th St., N. W., Washington 11. 

Sitgreaves, Rosedith, M.A. (Geo. Washington) Ass’t Stat., U. S. Public Health Service 
(on leave); Graduate student, Columbia Univ., New York 27, N. Y., Johnson Hall, 
411 W. 116th St. 

Tama, Joseph, B.A. (Washington) Pfc. U. S. Army, 5250 TIC; GHQ AFPAC; APO 500, 
c/o Postmaster, San Francisco, Calif. 

Tate, Merle W., Ed.M. (Harvard), M.A. (Montana) Assoc. Prof. of Educ., Hamilton 
Coll., Clinton, N. Y. 

Thrall, Robert M., Ph.D. (Illinois) Ass’t Prof. of Math., Univ. of Mich., Ann Arbor, 
Mich., 953 Spring St. 

Vaughn, Kenneth W., Ph.D. (Iowa) Director, Graduate Record Examination Office of the 
Carnegie Foundation for the Advancement of Teaching; and, Assoc. Director of Co- 
operative Test Service of Amer. Council on Educ., 437 West 59th St., New York 19, N. Y. 

Wallace, Clifford A., Sup’t of Quality, Camera Works, Eastman Kodak Co., 333 State St., 
Rochester, N. Y. 

Wilkins, J. Ernest, Jr., Ph.D. (Chicago) Mathematician, American Optical Co., S.I.D., 
Box A, Buffalo 15, N. Y. 

Wilkinson, Roger I., B.S.E.E. (Iowa State) Member Technical Staff, Bell Telephone Labs., 
463 West St., New York, N. Y. 





REPORT ON THE BOSTON MEETING OF THE INSTITUTE 


The twenty-fourth meeting of the Institute of Mathematical Statistics was held 
at the Hotel Statler, Boston, Massachusetts, on Saturday, December 28, 1946. 
The meeting was held in conjunction with the One Hundred Thirteenth Annual 
Meeting of the American Association for the Advancement of Science. The 
following 45 members of the Institute attended the meeting: 


K. J. Arnold, M. S. Bartlett, W. D. Baten, C. I. Bliss, G. W. Brier, G. W. Brown, T. H. 
Brown, B. H. Camp, C. W. Churchman, W. G. Cochran, J. H. Curtiss, D. B. DeLury, P. V. 
Dorweiler, Churchill Eisenhart, Benjamin Epstein, H. A. Freeman, Hilda Geiringer, H. H. 
Germond, J. A. Greenwood, Boyd Harshbarger, W. A. Hendricks, E. H. C. Hildebrandt, 
W. C. Jacob, H. B. Kaitz, L. F. Knudsen, Walter Leighton, A. J. Lotka, J. W. Mauchly, 
Margaret Merrell, E. B. Mode, Frederick Mosteller, C. M. Mottley, Doris Newman, R. H. 
Noel, H. W. Norton, Otis Pope, C. J. Rees, C. F. Roos, P. J. Rulon, J. W. Tukey, W. M. 
Upholt, F. M. Wadley, C. L. Weaver, C. P. Winsor, W. J. Youden. 


At the morning session, a joint session with the Biometrics Section of the 
American Statistical Association, the following program was presented with 
Professor E. B. Wilson of Harvard University as chairman: 


Topic: The Analysis of Variance in Biology 


Papers: The Assumptions Underlying the Analysis of Variance 

Professor Churchill Eisenhart, University of Wisconsin and The National 
Bureau of Standards 

Some Consequences when the Assumptions are not Satisfied 

Professor W. G. Cochran, North Carolina State College 

The Use of Transformations 

Professor M. S. Bartlett, Cambridge University and the University of 
North Carolina 


Discussion: Professor Boyd Harshbarger, Virginia Polytechnic Institute 
Dr. W. C. Jacob, Long Island Vegetable Research Farm 
Professor C. P. Winsor, Johns Hopkins University 
Dr. W. J. Youden, Boyce Thompson Institute 


The program for the afternoon session, also a joint session with the Biometrics 
Section, under the chairmanship of Dr. E. J. DeBeer, Wellcome Research 
Laboratories, was as follows: 


Topic: The Analysis of Variance in Biology (continued) 

Papers: The Analysis of Covariance 
Professor D. B. DeLury, Virginia Polytechnic Institute 
Discriminant Functions 
Professor George W. Brown, Iowa State College 


Discussion: Professor W. D. Baten, Michigan State College 
Professor C. I. Bliss, Yale University 
Mr. W. A. Hendricks, U.S. Department of Agriculture 


P.S. Dwyer, 
Secretary. 











ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1946 


NEW OPPORTUNITIES 


The return to peacetime conditions presents the Institute with new oppor- 
tunities for expanding its activities and usefulness. An increased appreciation 
for mathematical statistics has followed the many contributions made by our 
members to the war effort. The numerous societies interested in specific appli- 
cations of statistics have come to look to the Institute both for leadership in 
theory and for playing its part in the dissemination of new results. As a result of 
the drastic interruption in the normal training of students during the war, there 
is unusually keen competition for the services of capable statisticians. Those of 
our members who are engaged in teaching are responsible for the execution of a 
vigorous training program to meet current and future demands promptly and 
without sacrifice of quality. In short, we are in a position, as never before, to 
advance the development and efficient use of mathematical statistics. The fol- 
lowing account of some of our activities during the year will indicate, I believe, 
that the record is creditable. Yet in many instances what has been accomplished 
is only a beginning. 


MEETINGS 


The Development Committee has repeatedly stressed the desirability of an 
extension in our customary schedule of meetings in order to provide additional 
contacts between mathematical statisticians and the users of statistics. Owing 
to the greater availability of railway and hotel accommodation in 1946, we ob- 
tained our first opportunity to put this extension into effect. The regular winter 
meeting with the American Statistical Association and other social science or- 
ganizations was resumed at Cleveland in January, while the late summer meeting 
with the mathematicians took place at Cornell in September. In addition, two 
meetings were held with different sections of the American Association for the 
Advancement of Science, at St. Louis in March and at Boston in December. 
On both occasions the programs were expository and attracted large audiences. 
Finally, at the invitation of Princeton University, a one-day meeting at Princeton 
in November was devoted to the analysis of variance. While no joint sessions 
were conducted with engineering or industrial societies, several of our members 
took prominent parts in the programs of such societies. 

For the near future, it seems desirable to continue the practice of meeting in 
the winter with the ASA and social science groups and in the summer with the 
mathematical groups. In 1947 these meetings will be at Atlantic City, January 
24-27, and at Yale, September 1-5, respectively. It is not known whether con- 
ditions in future years will produce a return to Christmas rather than January 
meetings: for the present the hotel situation swings the balance in favor of 
January. 




In 1946 the membership of the program committee was enlarged so that it 
would be better equipped to arrange joint meetings with other societies. We 
owe our thanks to the members for their successful efforts in the face of difficulties 
which still attend the planning of a meeting. 


ANNALS 


Despite the scarcity of manuscripts in the later stages of the war, our editor, 
Professor S. S. Wilks, succeeded throughout in maintaining the annual volumes 
of the Annals at their usual size. During 1946, scarcity gave way to plenty. 
The number of papers of good quality submitted in recent months is sufficiently 
great that there will be more than enough, by current estimates, to fill the 1947 
volume. To narrow the scope of the Annals or to reject good papers would be 
undesirable. Accordingly, the Directors have authorized an increase of 100 
pages in the 1947 volume if this is necessary to insure the publication of all ac- 
ceptable papers. 

A gratifying testimony to the prominence of the Annals in its field is the marked 
increase in the demand for back numbers. Our Secretary-Treasurer reports 
that sales amounted to $3,235. To meet actual or anticipated orders, eleven 
issues were reprinted during 1946 at a cost of $2,809. 

For most members of the Institute, even those who serve on the Board, work 
on Institute affairs occupies only a minor portion of our time. The editor is 
never free from some forthcoming publication deadline. Initial perusal of manu- 
scripts, selection of referees, editorial decisions, handling of the production phases 
of publication and much miscellaneous correspondence (not all of it pleasant) 
make editorial work a daily preoccupation, year in and year out. An annual 
word of thanks is an inadequate expression of our indebtedness to Professor 
Wilks. 


MEMBERSHIP AND FINANCE 


At the beginning of 1945 there were 606 members. A year later this figure 
had increased to 777 and at the end of 1946 the figure stood at 900. A fifty per- 
cent increase in two years is another evidence of the healthy growth of the Insti- 
tute. It has been attained to a considerable extent through the hard work of 
our Secretary-Treasurer, P. S. Dwyer, and the cooperation of individual members. 

The Secretary-Treasurer also reports a very satisfactory net gain in assets 
of $2,627 during the year. Nevertheless, financial problems may arise in the 
near future. Printing and other costs have risen sharply, and the printing of an 
enlarged Annals will be an additional drain on our resources. Both the Member- 
ship and Development Committees have given some thought to the need for 
additional revenue that may face us soon. They have recommended considera- 
tion of the possibility of Institutional Memberships, a device that has been found 
satisfactory by some other societies. A continued growth in membership will 
also help greatly to finance expanded activities. 







COMMITTEES 


Inter-society affairs: The report of the 1944 Committee on Development, stress- 
ing the need for closer cooperation amongst the various societies interested in 
statistics, provided the stimulus for active efforts in this direction. A meeting 
of representatives of these societies was called early in 1945 at the invitation of 
the American Statistical Association. This meeting suggested that a reconsti- 
tution of the ASA might enable it to become the central binding organization. 
Accordingly, a committee of the ASA has worked for a considerable time on a 
revision of the ASA constitution, which it is intended to submit to the votes of 
ASA members early in 1947. The new constitution provides for representation 
from other societies on the Council of the ASA, should these societies decide to 
associate or affiliate with the ASA. 

From our own point of view, it has seemed wise to delay action on certain 
internal affairs while awaiting the outcome of these developments in the ASA. 
Thus a statement of policy with regard to the formation of chapters of the IMS 
is needed and the problem has been considered both by a special committee in 
1945 and by the Development Committee in 1946. The latter committee recom- 
mends that no decision be made pending examination of the provisions for joint 
sponsorship of local and regional chapters in the new ASA constitution. Simi- 
larly, our own Committee on Revising the Constitution and By-Laws has de- 
ferred a final report until the attitude of our members towards the new develop- 
ments can be expressed. It is to be hoped that decisions can be taken in 1947. 

Tabulation: The advances made in recent years in the construction of new types 
of computing equipment justified an enlargement of our Committee on Tabula- 
tion, which now includes experts both on the building of machines and on the 
calculation and use of tables. The committee plans to keep our members in- 
formed of progress in this field. 

Government Service: Dr. W. Edwards Deming served as chairman of a new com- 
mittee on Mathematical Statistics and Statisticians in the Government Service. 
Although the federal government employs many mathematical statisticians, 
explicit recognition of the profession is lacking in many instances. As has hap- 
pened in other fields, statisticians are sometimes officially classed as economists 
and little provision is made for mathematical statisticians in recruitment policies. 
Moreover, it is probable that a number of branches of the government, at present 
unaware of the functions of a statistician, could employ several with profit. 
The new committee will endeavor to insure that mathematical statistics is recog- 
nized and effectively utilized in the federal service. 

Assistance to libraries: Like other professional societies, the Institute has re- 
ceived a number of appeals from libraries in war areas whose periodicals were 
looted or destroyed during the war. After careful consideration, the Board 
decided that official action should be limited to the free provision of missing 
copies of the Annals to all former subscribers who intend to renew subscriptions 
for the future. In addition, a committee with Professor J. Neyman as chairman 
was appointed to establish a procedure by which gifts of individual members 
(books, reprints, back numbers of the Annals or cash for the purchase of back 
numbers) could be handled. At the suggestion of this committee a general 
appeal for the small sum of 50 cents per member was circulated with the Decem- 
ber billing. Individual collections are also being made at certain centers. 

Teaching: The Committee on Teaching has not made as much progress as it 
would have liked, owing to the dispersal of its members and the taking up of new 
civilian posts. Members have, however, cooperated with the Committee on 
Applied Mathematical Statistics of the National Research Council, which is 
engaged on a somewhat similar survey. 

Rietz lecture: The first lecturer in the new series of lectures in honor of the late 
Henry Lewis Rietz will be Professor A. Wald. His topic will be “Sequential 
Estimation and Multi-Decisions”. The lecture will be delivered in connection 
with the Yale meetings, September 1947. 

Representatives: In addition to its committee work, the Institute cooperates, 
through representatives, with the Division of Physical Sciences of the National 
Research Council, the Joint Committee for the Development of Statistical Ap- 
plications in Engineering and Manufacturing, the American Association for the 
Advancement of Science, the Inter-Society Committee on Federation and the 
Policy Committee for Mathematics. The last committee, which was appointed 
in 1946, will consider important problems that affect the mathematics profession 
as a whole. 

Nominations: The Committee on Nominations, consisting of Professor P. R. 
Rider (chairman), Professor B. H. Camp and Professor G. M. Cox, has made the 
following nominations for officers in 1947. 


President: W. Feller 
Vice-Presidents: J. H. Curtiss 
M. H. Hansen 
Secretary-Treasurer: P. S. Dwyer 


While it is perhaps improper to comment on nominations, I should like to 
express my personal appreciation of Professor Dwyer’s action in being willing 
to offer himself for re-nomination as Secretary-Treasurer. The successful opera- 
tion of the Institute rests mainly on the Secretary-Treasurer, and the demands 
of the Office are even more continuous and exacting than those on the editor. 
Professor Dwyer’s splendid work during his first three years of office, carried on 
at considerable sacrifice of his research interests, deserves the best thanks and 
appreciation of every member. 

In conclusion, it is a pleasure to express my sincerest thanks to all committee 
chairmen and members and to all representatives for their excellent work for the 
good of the Institute, and to all Institute members for their loyal support. 

W. G. COCHRAN, 
President, 1946. 











Committees of the Institute 


Committee                          Personnel 

Development                        E. G. Olds (chairman), C. I. Bliss, M. A. 
                                   Girshick, F. C. Mosteller, P. S. Olmstead, 
                                   H. Scheffé 

Membership                         W. Feller (chairman), C. C. Craig, P. A. 
                                   Horst, T. Koopmans 

Program                            J. H. Curtiss (chairman), M. Friedman, B. 
                                   Harshbarger, W. N. Hurwitz, A. M. Mood, 
                                   F. C. Mosteller, J. W. Tukey 

Mathematical Statistics and        W. E. Deming (chairman) 
Statisticians in the Govern- 
ment Service 

Revising the Constitution          M. H. Hansen (chairman), C. I. Bliss, A. T. 
and By-Laws                        Craig, J. H. Curtiss, W. Shewhart 

Tabulation                         C. Eisenhart (chairman), P. S. Dwyer, H. 
                                   Goldstine, A. N. Lowan, H. W. Norton, G. R. 
                                   Stibitz 

Teaching                           H. Hotelling (chairman), W. Bartky, W. E. 
                                   Deming, M. Friedman 

Nominations                        P. R. Rider (chairman), B. H. Camp, G. M. 
                                   Cox 

Finance                            P. S. Dwyer (chairman), L. A. Knowler, C. F. 
                                   Roos, F. F. Stephan 

Subscription to Purchase An-       J. Neyman (chairman), W. Feller, P. L. Hsu 
nals for Countries Devas- 
tated by War 


Society                            Representatives 

Inter-Society Committee on         J. H. Curtiss, P. S. Olmstead 
Federation 

Policy Committee for Mathe-        W. Feller 
matics 

Joint Committee for the De-        F. C. Mosteller, S. S. Wilks 
velopment of Statistical 
Applications in Engineer- 
ing and Manufacturing 

American Association for the       G. W. Snedecor 
Advancement of Science 

Division of Physical Sciences,     W. Bartky 
NRC 











REPORT OF THE SECRETARY-TREASURER OF 
THE INSTITUTE FOR 1946 


The Institute of Mathematical Statistics held five meetings during 1946, at 
Cleveland on January 24-27, at St. Louis on March 30, at Ithaca on August 
22-23, at Princeton on November 1, and at Boston on December 28. 

The large number of meetings has necessitated frequent mailings to the 
membership. Memoranda to members, with appropriate enclosures, were sent 
out in January, March, June, July, October, and November. 

The Secretary-Treasurer wishes to acknowledge the cooperation of the mem- 
bers of the Institute in paying bills promptly, in considerable activity leading to 
an increase in membership, and in general looking after the interests of the In- 
stitute. 

At the beginning of 1946 the Institute had 777 members. During the year 
180 new members joined the Institute, an increase of 23%. However, during 
1946 the Institute lost 57 members. Of these, 15 resigned, 37 were dropped for 
non-payment of dues, and 5 are deceased. Some of the 37 dropped we have 
been unable to contact, and it is very probable that, in some cases, membership 
will be resumed in the future. The net increase in members during the year was 
123, or about 16%, making a total of 900 members. 
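
The membership figures in the preceding paragraph can be checked by simple arithmetic. The short Python sketch below is an editorial illustration only; the variable names are assumptions, and the percentages are rounded to the nearest whole number, as they appear to be in the report.

```python
# Membership reconciliation for 1946, using only figures quoted in the report.
# Variable names are illustrative and not taken from the Institute's records.
members_start = 777          # members at the beginning of 1946
joined = 180                 # new members during the year
resigned, dropped, deceased = 15, 37, 5

lost = resigned + dropped + deceased      # members lost during the year
net_increase = joined - lost              # net change in membership
members_end = members_start + net_increase

gross_pct = round(100 * joined / members_start)      # growth from new members
net_pct = round(100 * net_increase / members_start)  # growth after losses

print(lost, net_increase, members_end, gross_pct, net_pct)
```

The computed values agree with the figures stated in the report: 57 members lost, a net increase of 123, and a total of 900 members.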

The following members died during the year: 


Professor O. F. Banos 
Professor S. A. Cudmore 
Professor Dunham Jackson 
Dr. Walter F. Schilling 
Professor C. C. Wagner 


The office of the Secretary-Treasurer sent a reprint of an Annals article and 
information about the Institute to 1800 persons interested in Quality Control. 
At least 28 of the new members became members as a result of this drive. As a 
continuation of a campaign started in 1945, the Institute also sent literature 
about the Annals to several hundred libraries and laboratories. 

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in caring for the back issues of the Annals which are 
stored at Iowa City. 

A few comments about the financial statement which appears below are in 
order. In addition to the increase in membership, mentioned above, the chief 
rise in income resulted from the unprecedented sales in back issues which 
amounted to $3,234.88, an increase over the preceding year (the previous high) 
of 86%. These heavy sales, however, depleted the supplies of many of our early 
issues, so that we were forced to reprint eleven of these issues and also the cumula- 
tive index during the year. This cost $2,809.00 (for 500 copies of each) and in- 
dicates that a much larger portion of our assets is in inventory, as shown in Ex- 
hibit D. 




Following the instructions of the Finance Committee, Professor H. C. Carver 
was paid for his share of all issues in which he and the Institute had joint owner- 
ship. 

Nine members have paid life memberships during the year, increasing the 
total of life membership funds by $812.50. 

The net gain in assets of $2,627.23 is very satisfactory even though this gain 
is evident in increased inventory and not in a better cash position. 


FINANCIAL STATEMENT 
December 31, 1945, to December 31, 1946 


A. RECEIPTS 


BALANCE ON HAND, DECEMBER 31, 1945 ......................... $7,548.22 
[label illegible] ..........................................  4,638.40 
[Life memberships] .........................................    812.50 
[label illegible] ..........................................  2,057.54 
[Sales of back issues] .....................................  3,234.88 
[label illegible] ..........................................    150.00 
[label illegible] ..........................................    121.29 

[Total] .................................................... $18,562.83 


[B. EXPENDITURES] 


ANNALS—CURRENT 
    [label illegible] ..............................   $125.00 
    [label illegible] ..............................  4,566.27 
                                                                 $4,691.27 
ANNALS—BACK NUMBERS 
    [Purchase from H. C. Carver] ...................    644.50 
    [Reprinting] ...................................  2,809.00 
        Vol. I #1, II #2, II #3, III #3, IV #1, VII #3, VII #2, VIII 
        #1, 2, 3, 4, Cumulative Index 
    [label illegible] ..............................     41.46 
    [label illegible] ..............................     68.00 
                                                                  3,562.96 
[label illegible] ..............................................     25.62 
[label illegible] ..............................................    100.00 
OFFICE OF THE SECRETARY-TREASURER 
    Printing, Mimeographing, programs, etc. (including stamped 
        [word illegible]) ..........................   $967.14 
    Printing 1800 copies of Wald-Wolfowitz article .    140.00 
    Postage and supplies ...........................    375.00 
    [label illegible] ..............................  1,420.25 
                                                                  [subtotal illegible] 
MISCELLANEOUS .................................................. [amount illegible] 
BALANCE ON HAND, DECEMBER 31, 1946 (Cash and Bonds) ............ [amount illegible] 













C. SUMMARY OF RECEIPTS AND EXPENDITURES 


BALANCE ON HAND,* DECEMBER 31, 1945 ........................ $7,548.22 
RECEIPTS DURING 1946 ....................................... 11,014.61 
EXPENDITURES DURING 1946 ................................... 11,321.28 
BALANCE ON HAND,* DECEMBER 31, 1946 ........................  7,241.55 
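
The summary can be verified against the receipts listed in section A. The Python sketch below is an editorial illustration and not part of the Treasurer's statement. It assumes the opening balance of $7,548.22 given in section A and takes the sales of back issues as $3,234.88, the amount stated in the text of the report; most of the receipt labels are not legible, so the items are carried as bare amounts.

```python
from decimal import Decimal

# Opening balance on December 31, 1945 (section A).
opening_balance = Decimal("7548.22")

# The six receipt items of section A, in printed order. Most labels are
# illegible; 812.50 is the life membership payments and 3234.88 is the
# sales of back issues quoted in the text.
receipt_items = [Decimal(x) for x in
                 ("4638.40", "812.50", "2057.54", "3234.88", "150.00", "121.29")]
receipts = sum(receipt_items)

# Total expenditures during 1946 (section C).
expenditures = Decimal("11321.28")

closing_balance = opening_balance + receipts - expenditures
print(receipts, opening_balance + receipts, closing_balance)
```

Exact decimal arithmetic is used here so that the cents are not disturbed by binary floating-point rounding. The results agree with the printed receipts of $11,014.61, the section A total of $18,562.83, and the closing balance of $7,241.55.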


D. COMPARISON OF ASSETS ON DECEMBER 31, 1945 AND DECEMBER 31, 1946 


                                                          1945         1946 
[label illegible] ................................... $6,000.00    $5,000.00 
Life Membership Funds: Bonds ........................    888.00     1,888.00 
Life Membership Funds: Bank Dep. ....................    327.00       139.50 
Additional Bank Deposits ............................    333.22       214.05 
Current Accounts Receivable .........................    255.35       452.62 
Estimated Value (Cost) of back issues of Annals .....  4,497.95     7,234.58** 

[Total] ............................................ $12,301.52   $14,928.75 


[Net gain in assets] ..............................................  2,627.23 
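
The 1946 column of the comparison and the net gain can be checked in the same way. The Python sketch below is an editorial illustration; the dictionary keys are assumed labels (the label of the first row is not legible in this copy), and only the 1946 amounts and the two printed totals are used.

```python
from decimal import Decimal

# Assets on December 31, 1946, as printed in section D.
# Keys are illustrative; "general_bonds" stands in for an illegible label.
assets_1946 = {
    "general_bonds": Decimal("5000.00"),
    "life_membership_bonds": Decimal("1888.00"),
    "life_membership_bank_deposit": Decimal("139.50"),
    "additional_bank_deposits": Decimal("214.05"),
    "accounts_receivable": Decimal("452.62"),
    "back_issues_at_cost": Decimal("7234.58"),
}
total_1945 = Decimal("12301.52")   # printed total for December 31, 1945

total_1946 = sum(assets_1946.values())
net_gain = total_1946 - total_1945

# Cash and bonds only: everything except receivables and inventory.
cash_and_bonds = (total_1946
                  - assets_1946["accounts_receivable"]
                  - assets_1946["back_issues_at_cost"])

life_membership_funds = (assets_1946["life_membership_bonds"]
                         + assets_1946["life_membership_bank_deposit"])

print(total_1946, net_gain, cash_and_bonds, life_membership_funds)
```

The totals agree with the printed figures of $14,928.75 and $2,627.23. The cash and bonds come to $7,241.55, the balance on hand in section C, and the life membership funds come to the $2,027.50 cited in section E.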


E. LIABILITIES OF INSTITUTE OF MATHEMATICAL STATISTICS AS OF DECEMBER 31, 1946 


All bills which have been presented have been paid and there are no outstanding accounts 
against the Institute. The $2027.50 in Life Membership payments require the Institute to 
provide the privileges of membership for life for the 26 members who have made payments. 
Also, $2686.71 should be credited to 1947 dues and subscriptions. 


PAUL S. DWYER 
Secretary-Treasurer. 
December 31, 1946 





* In form of bank deposit and government bonds. 
** Value of Annals calculated at 67 cents per copy, and based on physical inventory. 










ANNUAL REPORT OF THE EDITOR FOR 1946 


During 1946 there was a considerable increase in the number of manuscripts 
submitted to the Editorial Committee of the Annals. A total of 49 papers in- 
cluding 18 short notes were published in the 1946 volume of the Annals. The 
publication of these papers together with various official reports of the Institute 
and the Directory of the Institute required a total of 555 pages. Plans are al- 
ready under way to expand the 1947 volume of the Annals to 600 pages. 

During recent years there has been a very noticeable broadening of interest 
in the field of probability and statistical theory on the part of readers and con- 
tributors to the Annals. Contributors to the 1946 volume came from university 
departments of astronomy, biology, mathematics, sociology and statistics; from 
Army, Navy and other government groups; and from industrial laboratories and 
quality control departments. More recently, contributions have been received 
from physicists, chemists and other groups. More contributions are now being 
received from overseas than in previous years. Every effort is being made to 
keep the Annals balanced with respect to these various directions of interest in 
probability and statistical problems. It is believed that one of the most effec- 
tive things which could be done for the readers of the Annals is to publish ex- 
pository articles from time to time on new fields of development in probability 
and statistical theory. Invitations have been accepted by several individuals to 
prepare such articles. 

Dr. Thornton C. Fry has asked to be relieved from the Editorial Com- 
mittee, as of January 1, 1947. The Editor wishes to take this opportunity to 
express his gratitude for the service which Dr. Fry has rendered in connection with 
the editorial work on the Annals during the past nine years. 

On behalf of the Editorial Committee for the Annals, the Editor wishes to 
acknowledge with thanks the refereeing assistance which has been provided by 
the following persons during 1946: R. L. Anderson, T. W. Anderson, David 
Blackwell, Z. W. Birnbaum, K. L. Chung, W. J. Dixon, J. L. Doob, M. A. 
Girshick, T. E. Harris, L. Henkin, M. Kac, Irving Kaplansky, Bradford F. Kim- 
ball, T. Koopmans, H. Levene, H. B. Mann, P. J. McCarthy, F. C. Mosteller, H. 
E. Robbins, D. F. Votaw, J. E. Walsh and C. P. Winsor. The Editor is also 
indebted to the following individuals at Princeton University for preparation of 
manuscripts for the printer, and other editorial assistance: Mrs. Gladys B. Huling, 
Mrs. Eleanor C. Schoenly and J. E. Walsh. 

S. S. WILKS 
Editor. 
December 31, 1947 








CONSTITUTION 
OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


ARTICLE I 
NAME AND PURPOSE 


1. This organization shall be known as the Institute of Mathematical Statistics. 
2. Its object shall be to promote the interests of mathematical statistics. 


ARTICLE II 
MEMBERSHIP 


1. The membership of the Institute shall consist of Members, Fellows, Honorary 
Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior 
members excepted, who have been members for twenty-three months prior to the date 
of voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as 
determined by the Committee on Membership and approved by the Board of Directors. 


ARTICLE III 


OFFICERS, BOARD OF DIRECTORS, AND COMMITTEE ON MEMBERSHIP 


1. The Officers of the Institute shall be a President, two Vice-Presidents, and a Secre- 
tary-Treasurer. The terms of office of the President and Vice-Presidents shall be one 
year and that of the Secretary-Treasurer three years. Elections shall be by majority 
ballots at Annual Meetings of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority vote of the 
individuals present at the organization meeting, and shall serve until December 31, 1936. 

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and 
three Fellows. At their first meeting subsequent to the adoption of this Constitution, the 
Board of Directors shall elect three members as Fellows to serve as the Committee on 
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their elec- 
tion for a term of three years. The President shall designate one of the Vice-Presidents as 
Chairman of this Committee. 


ARTICLE IV 
MEETINGS 


1. A meeting for the presentation and discussion of papers, for the election of Officers, 
and for the transaction of other business of the Institute shall be held annually at such 
time as the Board of Directors may designate. Additional meetings may be called from 
time to time by the Board of Directors and shall be called at any time by the President 
upon written request from ten Fellows. Notice of the time and place of meeting shall be 
given to the membership by the Secretary-Treasurer at least thirty days prior to the 
date set for the meeting. All meetings except executive sessions shall be open to the 
public. Only papers accepted by a Program Committee appointed by the President may 
be presented to the Institute. 

2. The Board of Directors shall hold a meeting immediately after their election and 
again immediately before the expiration of their term. Other meetings of the Board 
may be held from time to time at the call of the President or any two members of the 
Board. Notice of each meeting of the Board, other than the two regular meetings, 
together with a statement of the business to be brought before the meeting, must be 
given to the members of the Board by the Secretary-Treasurer at least five days prior to 
the date set therefor. Should other business be passed upon, any member of the Board 
shall have the right to reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call 
of the Chairman or any member of the Committee provided notice of such call and the 
purpose of the meeting is given to the members of the Committee by the Secretary- 
Treasurer at least five days before the date set therefor. Should other business be passed 
upon, any member of the Committee shall have the right to reopen the question at the 
next meeting. Committee business may also be transacted by correspondence if that 
seems preferable. 

4. At a regularly convened meeting of the Board of Directors, four members shall 
constitute a quorum. At a regularly convened meeting of the Committee on Member- 
ship, two members shall constitute a quorum. 


ARTICLE V 


PUBLICATIONS 
1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute. 
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the 
Board of Directors of the Institute. The term of office of the Editor may be terminated 
at the discretion of the Board of Directors. 
2. Other publications may be originated by the Board of Directors as occasion arises. 


ARTICLE VI 
EXPULSION OR SUSPENSION 


1. Except for non-payment of dues, no one shall be expelled or suspended except by 
action of the Board of Directors with not more than one negative vote. 


ARTICLE VII 


AMENDMENTS 


1. This constitution may be amended by an affirmative two-thirds vote at any regu- 
larly convened meeting of the Institute provided notice of such proposed amendment 
shall have been sent to each voting member by the Secretary-Treasurer at least thirty 
days before the date of the meeting at which the proposal is to be acted upon. Voting 
may be in person or by mail. 













BY-LAWS 
ARTICLE I 


DUTIES OF THE OFFICERS, THE EDITOR, BOARD OF DIRECTORS, AND 
COMMITTEE ON MEMBERSHIP 


1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the 
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, 
shall preside at the meetings of the Institute and of the Board of Directors. At meetings 
of the Institute, the presiding officer shall vote only in the case of a tie, but at meetings 
of the Board of Directors he may vote in all cases. At least three months before the date 
of the annual meeting, the President shall appoint a Nominating Committee of three 
members. It shall be the duty of the Nominating Committee to make nominations for 
Officers to be elected at the annual meeting and the Secretary-Treasurer shall notify all 
voting members at least thirty days before the annual meeting. Additional nomina- 
tions may be submitted in writing, if signed by at least ten Fellows of the Institute, up to 
the time of the meeting. 

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings 
at the meetings of the Institute and of the Board of Directors, send out calls for said 
meetings and, with the approval of the President and the Board, carry on the corre- 
spondence of the Institute. Subject to the direction of the Board, he shall have charge 
of the archives and other tangible and intangible property of the Institute and once a year 
he shall publish in the Annals of Mathematical Statistics a classified list of all Members and 
Fellows of the Institute. He shall send out calls for annual dues and acknowledge receipt 
of same; pay all bills approved by the President for expenditures authorized by the Board 
or the Institute; keep a detailed account of all receipts and expenditures, prepare a finan- 
cial statement at the end of each year and present an abstract of the same at the annual 
meeting of the Institute after it has been audited by a Member or Fellow of the Institute 
appointed by the President as Auditor. The Auditor shall report to the President. 

3. Subject to the direction of the Board, the Editor shall be charged with the responsibility 
for all editorial matters concerning the editing of the Annals of Mathematical Statistics. 
He shall, with the advice and consent of the Board, appoint an Editorial Committee 
of not less than twelve members to co-operate with him; four for a period of five years, 
four for a period of three years, and the remaining members for a period of two years, ap- 
pointments to be made annually as needed. All appointments to the Editorial Com- 
mittee shall terminate with the appointment of a new Editor. The Editor shall serve as 
editorial adviser in the publication of all scientific monographs and pamphlets authorized 
by the Board. 

4. The Board of Directors shall have charge of the funds and of the affairs of the 
Institute, with the exception of those affairs specifically assigned to the President or to 
the Committee on Membership. The Board shall have authority to fill all vacancies 
ad interim, occurring among the Officers, Board of Directors, or in any of the Committees. 
The Board may appoint such other committees as may be required from time to time 
to carry on the affairs of the Institute. The power of election to the different grades of 
Membership, except the grades of Member and Junior Member, shall reside in the Board. 

5. The Committee on Membership shall prepare and make available through the 
Secretary-Treasurer an announcement indicating the qualifications requisite for the 
different grades of membership. The Committee shall review these qualifications period- 
ically and shall make such changes in these qualifications and make such recommendations 
with reference to the number of grades of membership as it deems advisable. The power 
to elect worthy applicants to the grades of Member and Junior Member shall reside in the 
Committee, which may delegate this power to the Secretary-Treasurer, subject to such 
reservations as the Committee considers appropriate. The Committee shall make recom- 
mendations to the Board of Directors with reference to placing members in other grades 
of membership. The Committee shall give its attention to the question of increasing the 
number of applicants for membership and shall advise the Secretary-Treasurer on plans 
for that purpose. 


ARTICLE II 


DUES 


1. Members shall pay five dollars at the time of admission to membership and shall 
receive the full current volume of the Official Journal. Thereafter, Members shall pay 
five dollars annual dues. The annual dues of Junior Members shall be two dollars and 
fifty cents. 

The annual dues of Fellows shall be five dollars. The annual dues of Sustaining 
Members shall be fifty dollars. Honorary Members shall be exempt from all dues. 

(a) Exception. In the case that two Members of the Institute are husband and wife 
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be 
accepted by the Institute in place of all succeeding yearly dues and which will not other- 
wise alter his status as a Member or Fellow. The amount of this payment will depend 
upon the age of this Member or Fellow and will be based upon a suitable table and rate of 
interest, to be specified by the Board of Directors. 

(c) Exception. Any Member or Junior Member of the Institute serving, except as a 
commissioned officer, in the Armed Forces of the United States or of one of its allies, may 
upon notification to the Secretary-Treasurer be excused from the payment of dues until the 
January first following his discharge from the Service. He shall have all privileges of 
membership except that he shall not receive the Official Journal. However, during the 
first year of his resumed regular membership he may have the right to purchase, at $2.50 
per volume, one copy of each volume of the Official Journal published during the period 
of his service membership. 
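
The dues stated in section 1 above can be tabulated in a short sketch. This is only an illustration of the arithmetic, not part of the By-Laws: the names ANNUAL_DUES, SHARED_COPY_DUES, annual_dues, and the flag shares_journal_with_spouse are hypothetical, and the sketch assumes that exception (a) applies only to the grade of Member. Exceptions (b) and (c) are not modeled, since the table and rate of interest in (b) are left to the Board of Directors.

```python
from decimal import Decimal

# Annual dues by grade, in dollars, as stated in Article II, section 1.
ANNUAL_DUES = {
    "Junior Member": Decimal("2.50"),
    "Member": Decimal("5.00"),
    "Fellow": Decimal("5.00"),
    "Sustaining Member": Decimal("50.00"),
    "Honorary Member": Decimal("0.00"),  # exempt from all dues
}

# Exception (a): husband and wife, both Members, receiving one copy
# of the Official Journal between them.
SHARED_COPY_DUES = Decimal("3.75")


def annual_dues(grade: str, shares_journal_with_spouse: bool = False) -> Decimal:
    """Return the annual dues for one person of the given grade.

    `shares_journal_with_spouse` is a hypothetical flag for exception (a).
    This sketch assumes the exception applies to the grade of Member only.
    """
    if shares_journal_with_spouse:
        if grade != "Member":
            raise ValueError("exception (a) is read here as applying to Members only")
        return SHARED_COPY_DUES
    return ANNUAL_DUES[grade]


# Two married Members sharing one copy pay 3.75 each.
couple_total = annual_dues("Member", True) * 2
print(couple_total)  # 7.50
```

Under these assumptions, a married couple sharing one copy pays $7.50 a year between them, against $10.00 for two Members who each receive the Journal.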

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to 
the Official Journal. The annual dues of a Sustaining Member include two subscriptions 
to the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article. 
If such person fail to pay such dues within three months from the date of mailing such 
notice, the Secretary-Treasurer shall report the delinquent one to the Board of Directors, 
by whom the person’s name may be stricken from the rolls and all privileges of member- 
ship withdrawn. Such person may, however, be re-instated by the Board of Directors 
upon payment of the arrears of dues. 







ARTICLE III 
SALARIES 


1. The Institute shall not pay a salary to any Officer, Director, or member of any 
committee. 


ARTICLE IV 


AMENDMENTS 


1. These By-Laws may be amended in the same manner as the Constitution or by a 
majority vote at any regularly convened meeting of the Institute, if the proposed amend- 
ment has been previously approved by the Board of Directors. 








